[R] PAM clustering: using my own dissimilarity matrix

Hans Körber Tue, 29 Jun 2004 09:24:05 -0700

Hello,

I would like to use my own dissimilarity matrix in a PAM clustering with method "pam" (cluster package) instead of a dissimilarity matrix created by daisy.

I read data from a file containing the dissimilarity values using "read.csv". This creates a matrix (alternatively: an array or vector) which "pam" does not accept. The call

   p<-pam(d,k=2,diss=TRUE)

yields an error message "Error in pam(d, k = 2, diss = TRUE) : x is not of class dissimilarity and can not be converted to this class." How can I convert the matrix d into a dissimilarity matrix suitable for "pam"?

I'm aware of a response by Friedrich Leisch to a similar question posed by Jose Quesada (quoted below). But as I understood the answer, the dissimilarity matrix there is calculated on the basis of (random) data.

Thank you in advance.
Hans

__________________________________

>>>>> On Tue, 09 Jan 2001 15:42:30 -0700,
>>>>> Jose Quesada (JQ) wrote:

> Hi,
> I'm trying to use a similarity matrix (triangular) as input for the pam() or
> fanny() clustering algorithms.
> The problem is that these algorithms can only accept a dissimilarity
> matrix, normally generated by daisy().

> However, daisy only accepts 'data matrix or dataframe. Dissimilarities
> will be computed between the rows of x'.
> Is there any way to say that your data are already a similarity
> matrix (triangular)?
> In Kaufman and Rousseeuw's FORTRAN implementation (1990), they showed an
> option like this one:

> "Maybe you already have correlation coefficients between variables.
> Your input data consist of a lower triangular matrix of pairwise
> correlations. You wish to calculate dissimilarities between the
> variables."

> But I couldn't find this alternative in the R implementation.

> I cannot use foo <- as.dist(foo), nor daisy(foo...), because
> "Dissimilarities will be computed between the rows of x", and this is
> not what I mean.

> You can easily transform your similarities into dissimilarities like
> this (also recommended in Kaufman and Rousseeuw, 1990):

> foo <- (1 - abs(foo)) # where foo are similarities

> But then pam() will complain like this:

> " x is not of class dissimilarity and can not be converted to this
> class."

> Can anyone help me? I would also appreciate any advice about other clustering
> algorithms that can accept this type of input.

Hmm, I don't understand your problem, because proceeding as the docs
describe it works for me ...

If foo is a similarity matrix (with 1 meaning identical objects), then

bar <- as.dist(1 - abs(foo))
fanny(bar, ...)

works for me:

## create a random 12x12 similarity matrix, make it symmetric and set the
## diagonal to 1
> x <- matrix(runif(144), nc=12)
> x <- x+t(x)
> diag(x) <- 1

## now proceed as described in the docs
> y <- as.dist(1-x)
> fanny(y, 3)
iterations  objective
 42.000000   3.303235
Membership coefficients:
       [,1]      [,2]      [,3]
1 0.3333333 0.3333333 0.3333333
2 0.3333333 0.3333333 0.3333333
3 0.3333334 0.3333333 0.3333333
4 0.3333333 0.3333333 0.3333333
...
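
The same as.dist() route should also carry over to pam() when the dissimilarities come from a file rather than from daisy(). What follows is only a minimal sketch under stated assumptions, not a tested recipe for the poster's data: it assumes the CSV holds a square, symmetric matrix of dissimilarities with one header row, and the temporary file, the toy points and the names pts, csv_file, d_df and d_mat are all made up for illustration. Note that read.csv() returns a data frame, so it is converted with as.matrix() before as.dist().

```r
library(cluster)  # provides pam()

# Hypothetical stand-in for the poster's file: write a small symmetric
# dissimilarity matrix to a temporary CSV so the sketch is self-contained.
set.seed(1)
pts <- rbind(matrix(rnorm(10, mean = 0), ncol = 2),
             matrix(rnorm(10, mean = 5), ncol = 2))
csv_file <- tempfile(fileext = ".csv")
write.csv(as.matrix(dist(pts)), csv_file, row.names = FALSE)

# read.csv() yields a data frame, not a dissimilarity object.
d_df <- read.csv(csv_file)

# Convert to a numeric matrix and check that it is square.
d_mat <- as.matrix(d_df)
stopifnot(is.numeric(d_mat), nrow(d_mat) == ncol(d_mat))

# as.dist() keeps only the lower triangle and returns a "dist" object.
d <- as.dist(d_mat)

# Cluster directly on the supplied dissimilarities.
p <- pam(d, k = 2, diss = TRUE)
p$clustering
```

Because as.dist() reads only the lower triangle, a file that stores just a lower-triangular matrix (with anything, for example zeros, above the diagonal) should work the same way once it is loaded as a square numeric matrix.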

