Re: [R] millions of comparisons, speed wanted

Adrian DUSA Sat, 17 Dec 2005 11:01:08 -0800

The daisy function is _very_ good!
I have been able to use it for nominal variables as well, simply by:
daisy(input)*ncol(input)
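
A minimal sketch of why this works, assuming the nominal variables are stored as factors and there are no NAs: for nominal data daisy() returns, for each pair of rows, the fraction of variables on which they differ, so multiplying by ncol(input) recovers the number of differing variables. The toy data frame below is made up for illustration and is not the data from this thread.

```r
library(cluster)  # recommended package, shipped with R

# Hypothetical toy data: three nominal variables stored as factors.
input <- data.frame(
  a = factor(c("x", "x", "y")),
  b = factor(c("p", "q", "q")),
  c = factor(c("u", "u", "v"))
)

# For factors daisy() uses Gower's coefficient: the share of
# (non-missing) variables on which two rows differ.
d <- daisy(input)

# Rescale to a count of differing variables per pair of rows.
# round() guards against floating-point noise.
mismatches <- round(as.matrix(d) * ncol(input))
print(mismatches)
```

With NAs present, Gower's formula divides by the number of variables observed in both rows, so the product would no longer be a plain mismatch count.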

Now, for a very large number of rows (say 5000), daisy runs for about 3
minutes and uses the swap space. I probably need more RAM (only 512 MB on my
computer). But at least I get a result... :)
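
A rough back-of-the-envelope check of the memory involved, assuming the result is stored as a lower triangle of 8-byte doubles (as "dist"-like objects are). It only counts the result itself, not any working copies daisy() may make along the way, so treat it as a lower bound.

```r
n <- 5000                        # number of rows, as in the example above

n_pairs <- n * (n - 1) / 2       # number of pairwise dissimilarities
tri_mb  <- n_pairs * 8 / 1024^2  # lower triangle stored as doubles, in MiB
full_mb <- n^2 * 8 / 1024^2      # full n x n matrix, e.g. after as.matrix()

cat(sprintf("pairs: %.0f, triangle: %.1f MiB, full matrix: %.1f MiB\n",
            n_pairs, tri_mb, full_mb))
```

So the result alone takes a sizeable share of 512 MB, and converting it with as.matrix() roughly doubles that.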

For relatively small input matrices, it increased the speed by a
factor of 3. Way to go!

Best,
Adrian

On 12/16/05, Martin Maechler <[EMAIL PROTECTED]> wrote:
> I have not taken the time to look into this example,
> but
>        daisy()
> from the (recommended, hence part of R) package 'cluster'
> is more flexible than dist(), particularly in the case of NAs
> and for (a mixture of continuous and) categorical variables.
>
> It uses a version of Gower's formula in order to deal with NAs
> and asymmetric binary variables.  The example below looks like
> a very good match for this problem.
>
> Regards,
> Martin Maechler, ETH Zurich

