On 18 Apr 2005, at 19:10, Tyler Smith wrote:
> Hello,
>
> I am a relatively new user of R. I have written a basic function to
> calculate the Gower similarity function. I was motivated to do so
> partly as an exercise in learning R, and partly because the existing
> option (vegdist in the vegan package) does not accept missing values.
Speed is the reason to use C instead of R. It should be easy, almost
trivial, to modify vegdist.c so that it handles missing values. I
guess this handling means ignoring a value pair if either of its values
is missing, which is not so gentle on the metric properties so dear
to Gower. Package vegan is designed for ecological community data, which
generally do not have missing values (except in environmental data),
but contributions are welcome.
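As a rough illustration of the pair-skipping idea, here is a minimal
sketch in C. It is not the actual vegdist.c code: the function name
gower_pair, the use of NaN as the missing-value code, and the
precomputed range array are all assumptions made for the example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of vegan: Gower dissimilarity between
 * two observations x and y, each holding p variables.
 *
 * Assumptions: missing values are coded as NaN, and range[k] holds the
 * observed range (max - min) of variable k, computed by the caller.
 *
 * A variable is skipped whenever either value of the pair is missing.
 * The sum is divided by the number of variables actually used, so the
 * result is NaN when the two observations share no usable variable.
 */
double gower_pair(const double *x, const double *y,
                  const double *range, size_t p)
{
    double sum = 0.0;
    size_t used = 0;

    assert(x != NULL && y != NULL && range != NULL);

    for (size_t k = 0; k < p; k++) {
        if (isnan(x[k]) || isnan(y[k]))
            continue;               /* ignore the incomplete pair */
        if (range[k] > 0.0)         /* constant variable adds zero */
            sum += fabs(x[k] - y[k]) / range[k];
        used++;
    }
    return used > 0 ? sum / (double) used : NAN;
}
```

Because the divisor changes from pair to pair, two dissimilarities in
the same matrix may rest on different sets of variables, which is why
the metric properties can suffer.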
> I think I have succeeded - my function gives me the correct values.
> However, now that I'm starting to use it with real data, I realise
> it's very slow. It takes more than 45 minutes on my Windows 98 machine
> (R 2.0.1 Patched (2005-03-29)) with a 185x32 matrix with ca 100
> missing values. If anyone can suggest ways to speed up my function I
> would appreciate it. I suspect having a pair of nested for loops is
> the problem, but I couldn't figure out how to get rid of them.
cheers, jari oksanen
--
Jari Oksanen, Oulu, Finland