Dear All,

I am applying Ward's method to my clustering problem and have a couple
of questions. The original method merges two clusters at a time, choosing
at each step the merge that minimizes the increase in the sum of squared
errors. The error of a cluster element is the distance from the element
to the cluster centroid. The objective function is a double summation: the
sum of squared within-cluster errors is computed for each cluster, and
these sums are then added together.
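
To make the objective concrete, here is a minimal Python sketch of how I
understand it. The function names and the toy coordinates are my own
illustration and are not taken from [1]; I assume unit weights and plain
Euclidean distance.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Unweighted mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def cluster_sse(points):
    """Inner summation: squared errors of one cluster about its centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def total_sse(clusters):
    """Outer summation: add the within-cluster sums over all clusters."""
    return sum(cluster_sse(c) for c in clusters)


def merge_increase(a, b):
    """Increase in the objective if clusters a and b are merged."""
    return cluster_sse(a + b) - cluster_sse(a) - cluster_sse(b)


# Hypothetical toy data: one cluster of two points and one singleton.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0)]]
print(total_sse(clusters))
print(merge_increase(clusters[0], clusters[1]))
```

At each step Ward's method would merge the pair of clusters with the
smallest merge_increase.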

The original Ward's method assumes all elements have equal weight, but in my
application the elements have unequal weights. This difference makes it
unclear to me how to construct my initial dissimilarity matrix. Reference [1]
suggests that the initial matrix entries be d_{ij}^2, the squared Euclidean
distance between any two elements i and j, and states on page 145 that "it
is unknown what properties the resulting clusters would have unless the
similarity is the squared Euclidean distance".

My first question is: what are these properties? I think minimizing the
increase of the sum of squared errors is the most important one. What are
the others?

My second question is: should I use \frac{w_i w_j}{w_i + w_j} d_{ij}^2 as
the matrix entries in order to keep those properties, where w_i and w_j are
the weights of elements i and j, respectively? The factor
\frac{w_i w_j}{w_i + w_j} is nonlinear in the weights, and it may change a
squared Euclidean distance into something that is definitely not one. The
factor does not matter for the original Ward's method because it is the
same for every pair when all weights are equal (one half for unit weights).
On the other hand, if I think of my unequal-weight elements as clusters of
equal-weight elements, then my initial dissimilarity matrix should be
thought of as an intermediate matrix in the course of clustering
equal-weight elements. For example, if my initial data set has three
elements e1, e2, and e3 with weights 1, 1, and 2, respectively, then I
should be able to think of e3 as a cluster of two elements, e6 and e7, each
with weight 1.
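
Here is a small Python check of that idea, as a sketch only. The
coordinates are hypothetical, and I assume that e6 and e7 sit at the same
point as e3, so that the cluster {e6, e7} has zero internal error. Under
that assumption the weighted factor applied to e1 and e3 should give the
same increase in the sum of squared errors as merging e1 with the
unit-weight cluster {e6, e7}.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points):
    """Sum of squared errors of unit-weight points about their centroid."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(len(points[0]))]
    return sum(sq_dist(p, c) for p in points)


def weighted_merge_cost(w_i, x_i, w_j, x_j):
    """Candidate matrix entry: w_i w_j / (w_i + w_j) times d_ij squared."""
    return (w_i * w_j) / (w_i + w_j) * sq_dist(x_i, x_j)


# Hypothetical positions: e1 has weight 1, e3 has weight 2.
e1 = (0.0, 0.0)
e3 = (3.0, 0.0)
cost_weighted = weighted_merge_cost(1.0, e1, 2.0, e3)

# Assumption: e3 is replaced by two coincident unit-weight elements.
e6, e7 = e3, e3
cost_expanded = sse([e1, e6, e7]) - sse([e1]) - sse([e6, e7])

print(cost_weighted, cost_expanded)
```

The two printed values agree up to floating-point rounding, which is what
makes me think the weighted entries are the natural starting matrix.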

Is the first merging step questionable if I use d_{ij}^2 as the matrix
entries? I ask because the pair with the minimal d_{ij}^2 is not
necessarily the pair with the minimal \frac{w_i w_j}{w_i + w_j} d_{ij}^2.
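
A toy example of what worries me, written as a Python sketch. The
one-dimensional positions are made up for illustration; the weights 1, 1,
and 2 are the ones from my example above.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_cost(w_i, w_j, d2):
    """w_i w_j / (w_i + w_j) times the squared distance d2."""
    return (w_i * w_j) / (w_i + w_j) * d2


# Hypothetical positions on a line; weights as in the e1, e2, e3 example.
points = {"e1": (0.0,), "e2": (2.0,), "e3": (-1.9,)}
weights = {"e1": 1.0, "e2": 1.0, "e3": 2.0}


def best_pair(score):
    """Return the pair of element names with the smallest score."""
    return min(combinations(sorted(points), 2), key=lambda p: score(*p))


# First merge chosen from the plain squared distances.
raw_pair = best_pair(lambda i, j: sq_dist(points[i], points[j]))

# First merge chosen from the weighted entries.
ward_pair = best_pair(
    lambda i, j: weighted_cost(
        weights[i], weights[j], sq_dist(points[i], points[j])
    )
)

print(raw_pair, ward_pair)
```

With these made-up numbers the plain d_{ij}^2 matrix picks e1 and e3 first,
while the weighted entries pick e1 and e2, so the two matrices can disagree
already at the first merge.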

Thank you so much.

-Kevin

-------------
[1] M. R. Anderberg, Cluster Analysis for Applications. New York: Academic
Press, 1973.

----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l
