Hi All,

A question from a newbie using R 2.5.0 on Windows XP. Why does kmeans clustering, with apparently the exact same parameters, behave so differently between the following two examples?
> cl1 <- kmeans(subset(pointsUXO15555, select = c(2:4)), 10)

takes about 2 seconds to deliver a result, whereas

> cl1 <- clust(subset(pointsUXO15555, select = c(2:4)), k=10, method="kmeansHartigan")

dies after about 10 minutes and fills up the RAM:

*** running kmeansHartigan cluster algorithm...
*** calculating validity measure...
Error: cannot allocate vector of size 922.9 Mb
In addition: Warning messages:
1: Reached total allocation of 1023Mb: see help(memory.size)
2: Reached total allocation of 1023Mb: see help(memory.size)
3: Reached total allocation of 1023Mb: see help(memory.size)
4: Reached total allocation of 1023Mb: see help(memory.size)

If I understand correctly, both methods should give roughly the same results (modulo the initial random locations), since the default algorithm in kmeans is "Hartigan-Wong". My data frame is 3 columns by 15555 rows.

It must be that kmeans is more of a "core" R function, whereas clust is from the clustTool package. But isn't clustTool simply wrapping the core kmeans method? Why such a difference?

Thanks in advance,

Yves Moisan
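P.S. A rough back-of-the-envelope check on the 922.9 Mb figure in the error message. The sketch below only computes how much memory a full set of pairwise distances between 15555 rows would need if stored the way dist() stores them (lower triangle only, 8 bytes per double). The variable names are made up for illustration, and I have not confirmed what clustTool actually does in its "calculating validity measure" step, so treat the match as a hint rather than a diagnosis.

```r
# Illustrative arithmetic only; variable names are hypothetical.
n       <- 15555               # number of rows in the data frame (from the post)
n_pairs <- n * (n - 1) / 2     # number of pairwise distances (lower triangle, as in dist())
bytes   <- n_pairs * 8         # each distance stored as a double (8 bytes)
mb      <- bytes / 1024^2      # convert bytes to Mb as reported by R

round(mb, 1)                   # 922.9
```

If that is indeed where the allocation comes from, the memory cost would grow with the square of the number of rows, which plain kmeans() does not need since it only works with the points and the 10 centres.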