There are methods in many packages that can do this, even though that capability is essentially what distinguishes classification from clustering.  For example:
Since the mid-70s, SPSS's Discriminant Function Analysis procedure has allowed some cases (rows) in the data file to be left "unclassified".  After it finds the discriminating functions, it applies them to the unclassified cases and assigns each one to the closest group, also giving the probability of membership in each group and the probability that a case assigned to that group would lie that far from the group centroid.
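(Not SPSS itself, but the same idea in another tool: a minimal Python sketch using scikit-learn's LinearDiscriminantAnalysis, where the column names, the toy data, and the use of -1 to mark unclassified rows are all my own assumptions.)

import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical data file: two predictors and a "group" column,
# with -1 marking the cases left unclassified.
data = pd.DataFrame({
    "x1":    [1.0, 1.2, 0.9, 5.1, 4.8, 5.3, 1.1, 5.0],
    "x2":    [2.0, 2.1, 1.9, 6.0, 5.8, 6.2, 2.2, 5.9],
    "group": [0,   0,   0,   1,   1,   1,  -1,  -1],
})

classified   = data[data["group"] != -1]
unclassified = data[data["group"] == -1]

# Fit the discriminant functions on the classified cases only.
lda = LinearDiscriminantAnalysis().fit(classified[["x1", "x2"]],
                                       classified["group"])

# Assign each unclassified case to the closest group and report
# its probability of membership in each group.
print(lda.predict(unclassified[["x1", "x2"]]))
print(lda.predict_proba(unclassified[["x1", "x2"]]))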

Many of the other procedures in SPSS (e.g., Quick Cluster (k-means), TWOSTEP, TREE, REGRESSION, etc.) have explicit options to save a model so that it can be applied to cases that were not used in fitting it.
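(Again only a sketch, assuming scikit-learn and joblib stand in for the SPSS save-and-apply options: fit k-means on the original cases, save the model, then assign new cases to the nearest cluster centre without refitting.  The file name and toy data are made up.)

import numpy as np
from joblib import dump, load
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 2))        # cases used to build the model
new_cases = rng.normal(size=(5, 2))      # cases not used in fitting

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(train)
dump(km, "kmeans_model.joblib")          # save the fitted model

km_later = load("kmeans_model.joblib")   # later: apply the saved model
print(km_later.predict(new_cases))       # cluster memberships for the new cases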

P.S. The TWOSTEP clustering procedure in SPSS reports AIC and BIC for each number of clusters in a requested range, which can help you determine how many clusters to retain.
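(TWOSTEP computes its criteria in its own way; as a rough analogue outside SPSS, a Gaussian mixture model in scikit-learn reports AIC and BIC for each candidate number of clusters, and the lowest BIC is a reasonable default to retain.  The data and range below are made up.)

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, size=(50, 2)) for c in (0.0, 4.0, 8.0)])

for k in range(1, 7):   # requested range of cluster counts
    gm = GaussianMixture(n_components=k, random_state=0).fit(X)
    print(k, round(gm.aic(X), 1), round(gm.bic(X), 1))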

Art
[EMAIL PROTECTED]
Social Research Consultants
University Park, MD  USA
(301) 864-5570




Jay Liu wrote:

Dear all,

Apart from how to determine the number of clusters, another difficulty in clustering (I think) is how to predict the cluster memberships of new data.  This is very straightforward in classification, but I can't think of a single clustering method I know of that can do this.  I guess some model-based techniques may be able to do it, but frankly I have no clue at all.

Jay.
