[R] cluster analysis using Dmax

Kris Lockyear Wed, 01 Nov 2006 06:26:13 -0800

Dear All,

A long time ago I ran a cluster analysis in which the dissimilarity matrix 
consisted of Dmax (or Kolmogorov-Smirnov distance) values, that is, the 
maximum difference between two cumulative proportion curves.  This all 
worked very well indeed.  The matrix was calculated using Dbase III+, which 
took a day and a half; the clustering was done using MV-ARCH, and the 
resultant dendrogram was converted from HP Plotter language to PostScript 
manually.  As you might guess, I'd like to be able to do this more 
efficiently in R.


I have looked through the various help files and found that some of the 
clustering routines will take a dissimilarity matrix as input (yay!).
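
As an illustration of that input route: hclust() in the base stats package 
accepts a "dist" object, and as.dist() converts an ordinary square matrix 
into one.  A minimal sketch, assuming a made-up 4 x 4 dissimilarity matrix 
(the values and the item names A to D are invented purely for illustration):

```r
# Hypothetical symmetric dissimilarity matrix for four items (values invented).
d <- matrix(c(0.0, 0.1, 0.6, 0.7,
              0.1, 0.0, 0.5, 0.8,
              0.6, 0.5, 0.0, 0.2,
              0.7, 0.8, 0.2, 0.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "B", "C", "D"), c("A", "B", "C", "D")))

# as.dist() reads the lower triangle of the square matrix;
# hclust() then performs average-link hierarchical clustering on it.
hc <- hclust(as.dist(d), method = "average")

plot(hc)  # draws the dendrogram on the current graphics device
```

Any graphics device R supports (for example postscript() or pdf()) can be 
opened before plot() to write the dendrogram straight to a file.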

My questions (as a very novice R user) are:

a) how would one go about calculating the matrix of Dmax/KS distance values?

b) of the many clustering packages (I'll be doing a simple average link 
hierarchical clustering) is there one where I can ask: "If I 'cut' the 
dendrogram at the 0.x dissimilarity level, which items are in which  
clusters?" (As my dataset has over 200 items this is non-trivial to work out 
manually).
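
One possible way to approach both questions with base R only is sketched 
below.  It assumes (this is not stated above) that the data sit in a matrix 
of counts with one row per item and one column per ordered class; the 
object names counts, cumprop, dmax and groups, the simulated data and the 
cut level of 0.1 are all hypothetical.

```r
# Hypothetical data: 6 items (rows) x 5 ordered classes (columns) of counts.
set.seed(1)
counts <- matrix(rpois(30, lambda = 10), nrow = 6,
                 dimnames = list(paste0("item", 1:6), NULL))

# (a) Cumulative proportion curve for each item, one row per item.
cumprop <- t(apply(counts, 1, function(x) cumsum(x) / sum(x)))

# Dmax for every pair of items: the largest absolute gap between two curves.
# dist(method = "maximum") computes this supremum distance between rows.
dmax <- dist(cumprop, method = "maximum")

# (b) Average-link hierarchical clustering on the Dmax matrix.
hc <- hclust(dmax, method = "average")

# Cut the dendrogram at a chosen dissimilarity level (0.1 is arbitrary here).
groups <- cutree(hc, h = 0.1)

# List the item names that fall in each cluster.
split(names(groups), groups)
```

cutree() also takes k = instead of h = when a fixed number of clusters is 
wanted.  If the items were raw samples rather than binned counts, 
ks.test(x, y)$statistic would give the two-sample D for a single pair.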

Many thanks indeed for your help.

Kris Lockyear.
