On Fri, Mar 9, 2012 at 1:50 PM, Massimo Di Stefano
<massimodisa...@gmail.com> wrote:
> Peter,
>
> thanks very much for your answer.
>
> install.packages("flashClust")
> library(flashClust)
> data <- read.csv('/Users/epifanio/Desktop/cluster/x.txt')
> data <- na.omit(data)
> data <- scale(data)
>
> > mydata
>              a             b            c          d          e
> 1 -0.207709346 -6.618558e-01  0.481413046  0.7761133 0.96473124
> 2 -0.207709346 -6.618558e-01  0.481413046  0.7761133 0.96473124
> 3 -0.256330843 -6.618558e-01 -0.352285877  0.7761133 0.96473124
> 4 -0.289039851 -6.618558e-01 -0.370032451 -0.2838308 0.96473124
>
> My goal is to group my observations by 'speciesID'.
> The speciesID is the last column, 'e'.
>
> Before going ahead, I need to understand how to tell R that it has to
> generate the groups using column 'e' as the variable,
> so that the groups are by speciesID.
>
> With these instructions:
>
> d <- dist(data)
> clust <- hclust(d)
>
> it is not clear to me how R will know to use column 'e' as the label.

Well, you didn't say that column e was a label that you wanted to keep
separate. Any other labels in the data? You may not want to use labels
in the distance calculation.

Do I understand right that you want to cluster each species separately?

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
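
To illustrate the point about keeping labels out of the distance calculation, here is a rough sketch rather than a definitive recipe. It assumes that column e holds the speciesID and that a to d are the only measurement columns. The toy data frame is made up so that the example runs without x.txt, and base R's hclust() is used, which should behave the same as the flashClust version for this purpose. Note that scale(data) as posted would also rescale the label column, so the sketch sets the label aside first.

```r
# Toy data standing in for x.txt; the values and species IDs are made up.
set.seed(1)
mydata <- data.frame(
  a = rnorm(8), b = rnorm(8), c = rnorm(8), d = rnorm(8),
  e = rep(c(101, 102), each = 4)   # assumption: e holds the speciesID
)
mydata <- na.omit(mydata)

# Set the label aside and scale only the measurement columns.
species  <- factor(mydata$e)
features <- scale(mydata[, c("a", "b", "c", "d")])

# Cluster on the measurements only; the label plays no part in the distances.
d     <- dist(features)
clust <- hclust(d)
# To show the species on the dendrogram leaves:
# plot(clust, labels = as.character(species))

# Cut the tree into as many groups as there are species and compare.
groups <- cutree(clust, k = nlevels(species))
print(table(species, groups))

# Other reading of the question: one clustering per species.
per_species <- lapply(split(as.data.frame(features), species),
                      function(x) hclust(dist(x)))
```

The cross-table shows how far the unsupervised groups agree with the known species. If the aim is simply to split the rows by species, split(mydata, mydata$e) already does that and no clustering is needed.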