Whenever several pairs were tied at the same minimum distance, I picked the pair to merge at random, and then measured the distortion of the resulting tree with the cophenetic correlation. I computed 10000 "random" trees for each of three methods: average, single and complete linkage. Among these randomly generated solutions, average linkage gave the highest cophenetic correlation, followed by complete and then single linkage. As expected, single linkage gave a constant cophenetic correlation across all of its "random" trees, since its cophenetic distances do not depend on the order in which ties are broken. My data set is rather small (25 objects). I am seriously thinking of enumerating all the possible solutions (about 30000, I guess), keeping the ones with the highest cophenetic correlation, and analyzing how consistent those solutions are with each other once a "natural" number of clusters has been established.
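For anyone who wants to try the same idea, here is a minimal sketch of the procedure described above. It is written in plain Python rather than R only to keep it self-contained, and it is not the code used for the results above: the function names, the tie tolerance and the toy distance matrix are all illustrative assumptions. It builds the tree naively from the original distances, picks one of the tied closest cluster pairs at random at every step, and reports the range of cophenetic correlations over repeated runs.

```python
import itertools
import math
import random


def linkage_distance(d, a, b, method):
    """Distance between clusters a and b (lists of object indices)."""
    pairs = [d[i][j] for i in a for j in b]
    if method == "single":
        return min(pairs)
    if method == "complete":
        return max(pairs)
    if method == "average":
        return sum(pairs) / len(pairs)
    raise ValueError(f"unknown method: {method}")


def random_tie_tree(d, method, rng):
    """Agglomerate, choosing at random among equally close cluster pairs.

    Returns the cophenetic matrix: coph[i][j] is the height at which
    objects i and j first end up in the same cluster.
    """
    n = len(d)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        scored = [
            (linkage_distance(d, clusters[p], clusters[q], method), p, q)
            for p, q in itertools.combinations(range(len(clusters)), 2)
        ]
        best = min(s[0] for s in scored)
        # Tolerance for "equal" distances is an assumption of this sketch.
        tied = [s for s in scored
                if math.isclose(s[0], best, rel_tol=1e-12, abs_tol=1e-12)]
        height, p, q = rng.choice(tied)
        for i in clusters[p]:
            for j in clusters[q]:
                coph[i][j] = coph[j][i] = height
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return coph


def cophenetic_correlation(d, coph):
    """Pearson correlation between original and cophenetic distances."""
    n = len(d)
    x = [d[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [coph[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_spread(d, method, n_trees=200, seed=0):
    """Smallest and largest cophenetic correlation over n_trees random trees."""
    rng = random.Random(seed)
    values = [cophenetic_correlation(d, random_tie_tree(d, method, rng))
              for _ in range(n_trees)]
    return min(values), max(values)


if __name__ == "__main__":
    # Toy data (hypothetical): points on a line, so many distances are tied.
    points = [0, 1, 2, 4, 5, 7]
    dist = [[abs(a - b) for b in points] for a in points]
    for method in ("average", "single", "complete"):
        low, high = correlation_spread(dist, method)
        print(method, round(low, 4), round(high, 4))
```

The naive recomputation of cluster distances is slow but harmless for 25 objects. In R, a common way to get a similar effect is to permute the order of the objects before clustering, since the tie that gets merged first usually depends on that order.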
Bruno