The daisy() function is _very_ good! I have been able to use it for nominal variables as well, simply by computing: daisy(input)*ncol(input)
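A minimal sketch of that trick, assuming every column of the input is a factor and there are no NAs. The toy data frame and its column names are made up for illustration. In that case daisy() falls back to Gower's coefficient, which for nominal variables is the share of variables on which two rows disagree, so multiplying by ncol(input) should turn it into a mismatch count:

```r
library(cluster)  # recommended package, installed with R

# Hypothetical toy data: three nominal variables stored as factors
input <- data.frame(
  colour = factor(c("red", "red", "blue")),
  shape  = factor(c("round", "square", "square")),
  size   = factor(c("S", "S", "L"))
)

# With non-numeric columns daisy() uses Gower's coefficient;
# for factors each variable contributes 0 (same level) or 1 (different level)
d <- as.matrix(daisy(input))

# Rescale from "fraction of variables that differ" to "number that differ"
mismatches <- d * ncol(input)
print(mismatches)
```

By construction, rows 1 and 2 differ only in shape, rows 2 and 3 differ in colour and size, and rows 1 and 3 differ in all three variables, which gives an easy hand check of the result. With NAs present the rescaling by ncol(input) would no longer be a plain count, since Gower's formula then averages only over the variables observed in both rows.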

Now, for a very large number of rows (say 5000), daisy() runs for about 3 minutes and has to use swap space. I probably need more RAM (only 512 MB on my computer). But at least I get a result... :)

For relatively small input matrices, it increased the speed by a factor of 3. Way to go!

Best,
Adrian

On 12/16/05, Martin Maechler <[EMAIL PROTECTED]> wrote:
> I have not taken the time to look into this example,
> but
>       daisy()
> from the (recommended, hence part of R) package 'cluster'
> is more flexible than dist(), particularly in the case of NAs
> and for (a mixture of continuous and) categorical variables.
>
> It uses a version of Gower's formula in order to deal with NAs
> and asymmetric binary variables. The example below looks like
> a very good match for this problem.
>
> Regards,
> Martin Maechler, ETH Zurich

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html