Re: Clustering techniques, tips and tricks

Drew Farris Tue, 05 Jan 2010 18:44:23 -0800

Each iteration of k-means produces a clusters-X folder, with X starting
at 0. You get only clusters-0 when the clusters converge after the
first iteration.
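
If you want to pick the folder for the ClusterDumper programmatically,
something like the following might do. It is only a rough sketch: it
assumes the highest-numbered clusters-X folder holds the output of the
last iteration, and that the output directory is on the local
filesystem (on HDFS you would have to list the directory some other
way). The name latest_clusters_dir is made up for illustration and is
not part of Mahout.

```python
import re
from pathlib import Path
from typing import Optional


def latest_clusters_dir(output_dir: str) -> Optional[Path]:
    """Return the clusters-N subfolder with the largest N, or None if there is none.

    Hypothetical helper, not part of Mahout. Assumes the highest-numbered
    folder is the result of the final k-means iteration.
    """
    pattern = re.compile(r"^clusters-(\d+)$")
    best = None
    best_n = -1
    for child in Path(output_dir).iterdir():
        match = pattern.match(child.name)
        # Compare numerically so that clusters-10 sorts after clusters-9.
        if child.is_dir() and match and int(match.group(1)) > best_n:
            best_n = int(match.group(1))
            best = child
    return best
```

The comparison is numeric rather than lexical, so a run with ten or
more iterations still resolves to the right folder.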

Whether your clusters retain document IDs depends on how you create
the vectors. For example, the Lucene vector dumper can be told to
extract the value of a specific field in the index and use it as the
vector label. These labels are carried through to the points file
produced at the end of the k-means run.

On Tue, Jan 5, 2010 at 9:36 PM, Bogdan Vatkov <[email protected]> wrote:
> Is there some description of the content of the cluster vector?
> I also noticed that I end up with some folders clusters-0 and clusters-1,
> but sometimes it is only clusters-0, when do we get the different folders
> and which should be used as end result - e.g. by the ClusterDumper?
