Hi,

I'm using heatmap.2 to cluster my data, using the centroid method for 
clustering and the maximum method for calculating the distance matrix:

library("gplots")
library("RColorBrewer")

test <- matrix(c(0.96, 0.07, 0.97, 0.98, 0.50, 0.28, 0.29, 0.77,
                 0.08, 0.96, 0.51, 0.51, 0.14,  0.19, 0.41, 0.51),
               ncol=4, byrow=TRUE)
colnames(test) <- c("Exp1","Exp2","Exp3","Exp4")
rownames(test) <- c("Gene1","Gene2","Gene3", "Gene4")
test <- as.table(test)
mat = data.matrix(test)

heatmap.2(mat, dendrogram="row", Rowv=TRUE,
    Colv=FALSE, distfun = function(x) dist(x,method = 'maximum'),
    hclustfun = function(x) hclust(x,method = 'centroid'),
    xlab = NULL, ylab = NULL, key=TRUE,
    keysize=1, trace="none", density.info=c("none"),
    margins=c(6, 12), col=bluered
)

This gives a heatmap with inversions in the cluster tree, which is inherent to 
the centroid method. One way to avoid inversions is to use the Euclidean or 
the city-block distance, and indeed, if you change 'maximum' to 'euclidean' in 
the above example, the inversions are gone. (For reference, see chapter 4.1.1 at 
http://bonsai.hgc.jp/%7Emdehoon/software/cluster/manual/Hierarchical.html)
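
To make "inversion" concrete outside of heatmap.2, here is a minimal sketch 
that calls dist() and hclust() directly on the example table. It assumes that 
an inversion shows up as a decrease in the merge heights stored in hc$height; 
the helper name has_inversion is made up for illustration and is not part of 
any package.

```r
# Same example table as above, without the heatmap.
test <- matrix(c(0.96, 0.07, 0.97, 0.98,
                 0.50, 0.28, 0.29, 0.77,
                 0.08, 0.96, 0.51, 0.51,
                 0.14, 0.19, 0.41, 0.51),
               ncol = 4, byrow = TRUE)
rownames(test) <- c("Gene1", "Gene2", "Gene3", "Gene4")

# Hypothetical helper: hclust() stores merge heights in merge order, so a
# height that is lower than the previous one indicates an inversion.
has_inversion <- function(hc) is.unsorted(hc$height)

# Centroid clustering on the two distance matrices compared in the text.
hc_max <- hclust(dist(test, method = "maximum"),   method = "centroid")
hc_euc <- hclust(dist(test, method = "euclidean"), method = "centroid")

print(hc_max$height)          # merge heights with the maximum distance
print(has_inversion(hc_max))
print(hc_euc$height)          # merge heights with the Euclidean distance
print(has_inversion(hc_euc))
```

Running the same check on the real data should show at which merge step the 
height drops, independently of how heatmap.2 draws the tree.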

Now for my problem: when I use my actual data instead of this example table, 
the inversions are still there after I change to euclidean. The R code is 
exactly the same as in this example; only the data is different. When I use 
Cluster 3.0 and Java TreeView with the Euclidean distance and the centroid 
method, there are no inversions in my data, as expected. So why does R give 
inversions? The theory and the other software say it shouldn't.

Here is an example where changing maximum to euclidean does not fix the 
inversions (as opposed to the above example, where it did):

library("gplots")
library("RColorBrewer")

test <- matrix(c(0.96, 0.07, 0.97, 0.98, 0.99, 0.50, 0.28, 0.29, 0.77, 0.78, 
0.08, 0.96, 0.51, 0.51, 0.55, 0.14, 0.19, 0.41, 0.51, 0.40, 0.97, 0.98, 0.99, 
0.50, 0.28),ncol=6,byrow=TRUE)
colnames(test) <- c("Exp1","Exp2","Exp3","Exp4","Exp5","Exp6")
rownames(test) <- c("Gene1","Gene2","Gene3", "Gene4")
test <- as.table(test)
mat=data.matrix(test)

heatmap.2(mat, dendrogram="row", Rowv=TRUE,
    Colv=FALSE, distfun = function(x) dist(x,method = 'maximum'),
    hclustfun = function(x) hclust(x,method = 'centroid'),
    xlab = NULL, ylab = NULL, key=TRUE,
    keysize=1, trace="none", density.info=c("none"),
    margins=c(6, 12), col=bluered
)

Do you have any idea what could be the cause of this discrepancy?

Kind regards


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
