Dear Chris:

I tried using cor+1, but the average silhouette width is still below 0:

> set.seed(1000)
> t9 <- cor(t(x), method="pearson")+1 # here i add 1
> t8 <- as.dist(t9)
> t7 <- cutree(hclust(t8), 4)
> cluster.stats(t8, t7)$avg.silwidth
[1] -0.008750826

> set.seed(1000)
> t9 <- cor(t(x), method="pearson") # here I did not add 1
> t8 <- as.dist(t9)
> t7 <- cutree(hclust(t8), 4)
> cluster.stats(t8, t7)$avg.silwidth
[1] -0.09543089

On 10/18/06, Weiwei Shi <[EMAIL PROTECTED]> wrote:
> Dear Chris:
>
> thanks for the prompt reply!
>
> You are right, dist from pearson has negatives there, which I should
> use cor+1 in my case (since negatively correlated genes should be
> considered farthest). Thanks.
>
> as to the ?cluster.stats, I double-checked it and I found I need to
> restart my JGR, until then the help page function starts to accept
> newly loaded package, like fpc for this case.
>
> sorry for the confusion,
>
> weiwei
>
> On 10/18/06, Christian Hennig <[EMAIL PROTECTED]> wrote:
> > Dear Weiwei,
> >
> > > btw, ?cluster.stats does not work on my Mac machine.
> > >> version
> > > _
> > > platform i386-apple-darwin8.6.1
> > > arch i386
> > > os darwin8.6.1
> > > system i386, darwin8.6.1
> > > status
> > > major 2
> > > minor 3.1
> > > year 2006
> > > month 06
> > > day 01
> > > svn rev 38247
> > > language R
> > > version.string Version 2.3.1 (2006-06-01)
> >
> > Because I don't have access to a Mac, I can't tell you anything about
> > this, unfortunately.
> > I always thought that my package should work on all platforms if it passes
> > all the standard tests for packages?
> > (Is there anyone else who could comment on this please?)
> >
> > > I have a sample like this
> > >> dim(dd.df)
> > > [1] 142 28
> > >
> > > and I want to cluster rows;
> > > first of all, I followed the examples for cluster.stats() by
> > > d.dd <- dist(dd.df) # use Euclidean
> > > d.4 <- cutree(hclust(d.dd), 4) # 4 clusters I tried
> > > cluster.stats(d.dd, d.4) # gives me some results like this:
> > >
> > > $cluster.size
> > > [1] 133 5 2 2
> > >
> > > $avg.silwidth
> > > [1] 0.9857916
> > >
> > > but when I tried to use pearson dist here, by visualization, I think 4
> > > or 5 clusters are good for pearson dist, but it gave me a very bad
> > > avg.silwidth
> > >
> > > d.dd <- as.dist(cor(t(x),method="pearson")) # is this correct?
> > > $cluster.size
> > > [1] 86 31 6 19
> > >
> > > $avg.silwidth
> > > [1] -0.09543089
> >
> > cor can give negative values, which doesn't fit the usual definition
> > of a distance. I don't know what as.dist does in this case, but I think
> > that, depending on your application, you should rather use the absolute
> > value of the correlation, or 1+cor.
> >
> > > btw, what's $separation? where can I find the detailed explanation on
> > > the output from cluster.stats?
> >
> > This is documented on the cluster.stats help page:
> >
> > separation: vector of clusterwise minimum distances of a point in the
> > cluster to a point of another cluster.
> >
> > Best regards,
> > Christian
> >
> >
> > *** --- ***
> > Christian Hennig
> > University College London, Department of Statistical Science
> > Gower St., London WC1E 6BT, phone +44 207 679 1698
> > [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
> >
>
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III
>

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
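A small sketch that may help when reading the numbers in this thread. as.dist() only reads the lower triangle of the matrix it is given and does not rescale anything, so the direction of the transformation matters. With cor+1, a perfectly negatively correlated pair ends up at distance 0, while 1-cor puts that pair at distance 2. The 1-cor variant is not proposed anywhere in the thread; it is shown here only as one common correlation-based dissimilarity, on the assumption that negatively correlated rows should be farthest apart. The data and all names below (base, x, d_plus, d_minus) are made up for illustration and are not the poster's data.

```r
# Synthetic data (an assumption, not the poster's x):
# two rows that follow a profile and two rows that follow its mirror image.
set.seed(1)
base <- rnorm(10)
x <- rbind(
  a1 = base,                         # reference profile
  a2 = base + rnorm(10, sd = 0.1),   # strongly positively correlated with a1
  b1 = -base,                        # perfectly negatively correlated with a1
  b2 = -base + rnorm(10, sd = 0.1)   # strongly negatively correlated with a1
)

# Row-by-row Pearson correlation, as in the thread.
r <- cor(t(x), method = "pearson")

# as.dist() takes the values as they are; it does not turn a similarity
# into a distance.
d_plus  <- as.dist(r + 1)   # 1 + cor: anti-correlated rows are close together
d_minus <- as.dist(1 - r)   # 1 - cor: anti-correlated rows are far apart

as.matrix(d_plus)["a1", "b1"]    # near 0
as.matrix(d_minus)["a1", "b1"]   # near 2

# Same clustering calls as in the thread, with 2 clusters for this toy data.
cl_plus  <- cutree(hclust(d_plus), 2)
cl_minus <- cutree(hclust(d_minus), 2)

cl_plus    # pairs each row with an anti-correlated partner
cl_minus   # groups a1 with a2 and b1 with b2
```

If sign should not matter at all, 1 - abs(r) is another option in the same spirit as the absolute-value suggestion quoted above. Which of these fits depends on the application.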
