Re: Question about the Wards clustering method

Kejun Mei Sun, 05 Aug 2007 13:48:08 -0700

The factor is \frac{w_i w_j}{w_i+w_j}. Sorry for any inconvenience.
Thank you.
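
As a quick sanity check of this factor, here is a minimal Python sketch. It is my own illustration, not taken from Anderberg. It assumes the weighted error of a cluster is the sum of w_k ||x_k - c||^2 with c the weighted centroid; the helper names (centroid, weighted_sse, ward_cost) and the sample points are hypothetical. It compares the direct increase in the weighted sum of squared errors after a merge with \frac{w_i w_j}{w_i+w_j} d_{ij}^2, where d_{ij} is the distance between the two centroids.

```python
def centroid(points, weights):
    """Weighted centroid of a cluster."""
    total = sum(weights)
    dim = len(points[0])
    return [sum(w * p[k] for p, w in zip(points, weights)) / total
            for k in range(dim)]

def weighted_sse(points, weights):
    """Sum of w_k * ||x_k - c||^2 over the cluster (assumed error definition)."""
    c = centroid(points, weights)
    return sum(w * sum((x - ck) ** 2 for x, ck in zip(p, c))
               for p, w in zip(points, weights))

def ward_cost(ci, wi, cj, wj):
    """Candidate merge cost: w_i w_j / (w_i + w_j) * d_ij^2."""
    d2 = sum((a - b) ** 2 for a, b in zip(ci, cj))
    return wi * wj / (wi + wj) * d2

# Two made-up clusters with unequal element weights.
A_pts, A_w = [(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0]
B_pts, B_w = [(5.0, 4.0)], [2.0]

# Increase in weighted SSE caused by merging A and B, computed directly.
direct = (weighted_sse(A_pts + B_pts, A_w + B_w)
          - weighted_sse(A_pts, A_w) - weighted_sse(B_pts, B_w))

# The same quantity from the factor times the squared centroid distance.
formula = ward_cost(centroid(A_pts, A_w), sum(A_w),
                    centroid(B_pts, B_w), sum(B_w))

print(direct, formula)
```

Under these assumptions the two printed numbers should agree up to rounding, and for two unit-weight elements the factor reduces to one half.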


-Kejun

On 8/5/07, Kejun (Kevin) Mei <[EMAIL PROTECTED]> wrote:
> Dear All,
>
> I am applying Ward's method to my clustering problem and have a couple
> of questions. The original method tries to merge two clusters at a time
> while minimizing the increase of the sum of squared errors at each step of
> clustering. The error of a cluster element is the distance from the element
> to the cluster centroid. The objective function is a double summation: first
> computing the sum of within cluster errors for each cluster, then adding all
> of them together.
>
> The original Ward's method assumes all elements have equal weight, but in my
> application the element weights are unequal. This difference makes me unsure
> how to construct my initial dissimilarity matrix. Reference [1] suggests that
> the initial matrix entries be d_{ij}^2, the squared Euclidean distance between
> any two elements i and j, and states on page 145 that "it is unknown what
> properties the resulting clusters would have unless the similarity is the
> squared Euclidean distance".
>
> My first question is: what are these properties? I think minimizing the
> increase in the sum of squared errors is the most important one. What are
> the others?
>
> My second question is: should I use \frac{w_i w_j}{w_i+w_j} d_{ij}^2 as
> matrix entries in order to keep those properties, where w_i and w_j are the
> weights of elements i and j, respectively? The factor \frac{w_i
> w_j}{w_i+w_j} is nonlinear, and it may change a squared Euclidean distance
> into something that is definitely not one. The factor does not matter for the
> original Ward's method because it is the same constant for every pair when
> all weights are equal (one half for unit weights).
> On the other hand, if I think of my unequal-weight elements as clusters of
> equal-weight elements, then my initial dissimilarity matrix should be
> thought of as an intermediate matrix in the course of clustering equal-weight
> elements. For example, if my initial data set has three elements, e1, e2,
> and e3, with weights 1, 1, and 2, respectively, then I should be
> able to think of e3 as a cluster of e6 and e7, each with weight 1.
>
> Is the first merging step questionable if I use d_{ij}^2 as matrix
> entries? I ask because a minimal d_{ij}^2 does not imply a minimal
> \frac{w_i w_j}{w_i+w_j} d_{ij}^2.
>
> Thank you so much.
>
> -Kevin
>
> -------------
>  [1]. M. R. Anderberg, Cluster Analysis for Applications, New York: Academic
> Press, 1973.
>
>
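
To make the last question concrete, here is a small Python sketch. The weights 1, 1, and 2 follow the e1, e2, e3 example in the message; the 1-D positions are made up for illustration, and the helper names d2 and weighted_cost are hypothetical. It picks the first pair to merge once by the smallest d_{ij}^2 and once by the smallest \frac{w_i w_j}{w_i+w_j} d_{ij}^2.

```python
from itertools import combinations

# Weights follow the example in the message; positions are hypothetical.
pos = {"e1": 0.0, "e2": 2.1, "e3": 4.0}
w = {"e1": 1.0, "e2": 1.0, "e3": 2.0}

def d2(i, j):
    """Squared Euclidean distance between elements i and j."""
    return (pos[i] - pos[j]) ** 2

def weighted_cost(i, j):
    """w_i w_j / (w_i + w_j) * d_ij^2 for elements i and j."""
    return w[i] * w[j] / (w[i] + w[j]) * d2(i, j)

pairs = list(combinations(pos, 2))

# First merge chosen from the plain squared-distance matrix.
first_by_d2 = min(pairs, key=lambda p: d2(*p))

# First merge chosen from the weight-scaled matrix.
first_by_cost = min(pairs, key=lambda p: weighted_cost(*p))

print("min d^2:          ", first_by_d2)
print("min weighted cost:", first_by_cost)
```

With these particular positions the two criteria disagree: the plain squared distance selects (e2, e3), while the weighted entry selects (e1, e2). This only shows that the first merge can differ; it does not by itself say which matrix preserves the properties Anderberg refers to.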

----------------------------------------------
CLASS-L list.
Instructions: http://www.classification-society.org/csna/lists.html#class-l
