I also use NamedVectors to tie vectors back to their input objects, and I don't think there is any other way to do it.

On 12-09-2012 07:03, Pat Ferrel wrote:
Maybe I should reword this since it has nothing to do with SSVD.

When you run clustering and ask the driver to cluster the input vectors after 
the clusters are computed, it creates a file called clusteredPoints/part-m-xxx.

It contains (cluster ID, input vector) pairs, stored as (IntWritable, 
VectorWritable). When you use NamedVectors as input vectors, the NamedVectors 
are stored in clusteredPoints, so you can use the names to identify the 
classified vectors.
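
To make the idea concrete, here is a minimal sketch in plain Java with no Mahout dependency. Everything in it is a hypothetical stand-in: NamedPoint plays the role of a NamedVector, nearestCluster is a simple nearest-centroid assignment rather than Mahout's classification step, and the document names and centroids are made up. It only shows why carrying a name with each vector lets you map cluster IDs back to the input documents. In Mahout itself the wrapping would be done with new NamedVector(vector, name) and the name read back with getName().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NamedPointsSketch {

    /** Hypothetical stand-in for a named vector: values plus the ID of the source document. */
    static final class NamedPoint {
        final String name;
        final double[] values;

        NamedPoint(String name, double[] values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Index of the centroid closest to the point (squared Euclidean distance). */
    static int nearestCluster(double[] point, double[][] centroids) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double distance = 0.0;
            for (int i = 0; i < point.length; i++) {
                double diff = point[i] - centroids[c][i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return best;
    }

    /** Groups document names by assigned cluster ID; the name travels with each vector. */
    static Map<Integer, List<String>> namesByCluster(List<NamedPoint> points, double[][] centroids) {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (NamedPoint p : points) {
            int clusterId = nearestCluster(p.values, centroids);
            result.computeIfAbsent(clusterId, k -> new ArrayList<>()).add(p.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<NamedPoint> points = new ArrayList<>();
        points.add(new NamedPoint("doc-a", new double[] {0.1, 0.2}));
        points.add(new NamedPoint("doc-b", new double[] {9.8, 10.1}));
        points.add(new NamedPoint("doc-c", new double[] {0.3, -0.1}));
        double[][] centroids = {{0.0, 0.0}, {10.0, 10.0}};
        // Prints {0=[doc-a, doc-c], 1=[doc-b]}
        System.out.println(namesByCluster(points, centroids));
    }
}
```

Without the name field, the output would only contain raw vectors per cluster, which is the situation described next.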

However, if you do not create NamedVectors, there appears to be no way to 
identify the classified VectorWritables in clusteredPoints. Unless I missed 
something, there is no way to tie the classified vectors back to input objects 
(docs in my case).

Do I need to create my own classification to get the row IDs associated with 
clusters?

