What I did was, in the presence of equal-valued distances, to randomize the
selection among them and then compute the distortion of each solution using
the cophenetic correlation.
I computed 1 "random" tree for each of three methods: average, single
and complete linkage.
Among the "randomly" selected solutions, I kept the one with the highest
cophenetic correlation.
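
A minimal sketch of that procedure in R, assuming the dissimilarities come
from daisy() in package cluster and `mydata` is a placeholder for the data
frame:

    library(cluster)

    d <- daisy(mydata)                        # mydata: placeholder data frame
    set.seed(123)                             # record the seed so runs can be repeated
    for (m in c("average", "single", "complete")) {
        dr <- d + runif(length(d), 0, 1e-9)   # tiny noise breaks the ties at random
        tree <- hclust(dr, method = m)
        cc <- cor(cophenetic(tree), d)        # cophenetic correlation vs. original d
        cat(m, ":", cc, "\n")
    }

The tree with the highest cophenetic correlation is the one that distorts
the original distances the least.
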
Hi,
Brian Ripley already replied "don't use average linkage"... You might think
about k-medoids (pam) in package cluster instead.
However, average linkage is often not such a bad choice, and if you really
want to use it for your data, you may try the following: [...]
Among the hierarchical methods, single linkage is the only one whose result
does not depend on the order in which tied distances are merged. [...]
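
A hedged sketch of the k-medoids route (again with `mydata` as a placeholder
data frame, and k = 3 as an arbitrary example value):

    library(cluster)

    d <- daisy(mydata)    # general dissimilarity coefficient; placeholder data
    fit <- pam(d, k = 3)  # k-medoids directly on the dissimilarity object
    fit$clustering        # cluster membership of each object

Unlike the agglomerative methods, pam requires choosing the number of
clusters in advance.
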
On Wed, 3 Dec 2003, Bruno Giordano wrote:
> Hi,
> I'm clustering objects defined by categorical variables with a hierarchical
> algorithm - average linkage.
> My distance matrix (general dissimilarity coefficient) includes several
> distances with exactly the same values.
> As I see, a standard agglomerative procedure ignores this problem, simply
> selecting the first of the tied distances it finds. [...]

Bruno -
Many people add a tiny random number to each of the distances,
or deliberately randomize the input order. This means that the
clustering is not reproducible unless you go back to the original
random numbers, but it also keeps you from reading too much into
minor differences.
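
In R that might look like the sketch below (with `mydata` a placeholder data
frame and 1e-9 an arbitrary "tiny" magnitude):

    library(cluster)

    d <- daisy(mydata)                    # mydata: placeholder data frame
    set.seed(42)                          # keep the random numbers recoverable
    d <- d + runif(length(d), 0, 1e-9)    # tiny perturbation breaks the ties
    tree <- hclust(d, method = "average")

    # or: deliberately randomize the input order instead
    tree2 <- hclust(daisy(mydata[sample(nrow(mydata)), ]), method = "average")
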
Hi,
I'm clustering objects defined by categorical variables with a hierarchical
algorithm - average linkage.
My distance matrix (general dissimilarity coefficient) includes several
distances with exactly the same values.
As I see, a standard agglomerative procedure ignores this problem, simply
selecting the first of the tied distances it finds. [...]
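
For concreteness, a small sketch of the setup being described, with
hypothetical data; daisy() in package cluster computes the general
dissimilarity coefficient:

    library(cluster)

    # toy data frame of categorical variables (hypothetical)
    mydata <- data.frame(a = factor(c("x", "x", "y", "y", "z")),
                         b = factor(c("p", "q", "p", "q", "p")))
    d <- daisy(mydata)             # general dissimilarity coefficient
    any(duplicated(as.vector(d)))  # TRUE: several distances are exactly equal
    tree <- hclust(d, method = "average")

With only a few categorical levels, the dissimilarities take very few
distinct values, so ties like these are the rule rather than the exception.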