If it stopped working, I would feel confident calling that a bug. The KMeans
algorithm should forward the vectors as is.
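
For anyone reading the clusteredPoints output downstream, here is a minimal
sketch of recovering the IDs per cluster while guarding against the vectors
no longer arriving as NamedVectors. To keep it runnable without Mahout or
Hadoop on the classpath, the nested types below are stand-ins (assumptions),
not the real classes: they only mirror the name-plus-delegate shape of
Mahout's NamedVector and the vector accessor of WeightedVectorWritable. The
names ClusterIds, ClusteredPoint, WeightedVector, idsByCluster and demo are
hypothetical. In a real job the records would come from a SequenceFile reader
over clusteredPoints instead of an in-memory list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClusterIds {
    // Stand-in (assumption) for Mahout's Vector interface.
    public interface Vector {}

    // Stand-in (assumption) for a plain, unnamed vector.
    public static final class DenseVector implements Vector {
        final double[] values;
        public DenseVector(double... values) { this.values = values; }
    }

    // Stand-in (assumption) mirroring NamedVector: a delegate plus a name.
    public static final class NamedVector implements Vector {
        private final Vector delegate;
        private final String name;
        public NamedVector(Vector delegate, String name) {
            this.delegate = delegate;
            this.name = name;
        }
        public String getName() { return name; }
        public Vector getDelegate() { return delegate; }
    }

    // Stand-in (assumption) for WeightedVectorWritable: a weight plus a vector.
    public static final class WeightedVector {
        private final double weight;
        private final Vector vector;
        public WeightedVector(double weight, Vector vector) {
            this.weight = weight;
            this.vector = vector;
        }
        public double getWeight() { return weight; }
        public Vector getVector() { return vector; }
    }

    // Hypothetical holder for one clusteredPoints record: cluster ID and point.
    public static final class ClusteredPoint {
        final int clusterId;
        final WeightedVector point;
        public ClusteredPoint(int clusterId, WeightedVector point) {
            this.clusterId = clusterId;
            this.point = point;
        }
    }

    // Groups vector names (the upstream IDs) by cluster ID, in encounter order.
    // Fails loudly if a vector lost its name on the way through clustering.
    public static Map<Integer, List<String>> idsByCluster(List<ClusteredPoint> records) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (ClusteredPoint record : records) {
            Vector v = record.point.getVector();
            if (!(v instanceof NamedVector)) {
                throw new IllegalStateException(
                    "Expected a NamedVector in cluster " + record.clusterId);
            }
            result.computeIfAbsent(record.clusterId, k -> new ArrayList<>())
                  .add(((NamedVector) v).getName());
        }
        return result;
    }

    // Small in-memory example: three named points spread over two clusters.
    public static Map<Integer, List<String>> demo() {
        List<ClusteredPoint> records = new ArrayList<>();
        records.add(new ClusteredPoint(0,
            new WeightedVector(1.0, new NamedVector(new DenseVector(1.0, 2.0), "doc-1"))));
        records.add(new ClusteredPoint(1,
            new WeightedVector(1.0, new NamedVector(new DenseVector(9.0, 8.0), "doc-2"))));
        records.add(new ClusteredPoint(0,
            new WeightedVector(1.0, new NamedVector(new DenseVector(1.5, 2.5), "doc-3"))));
        return idsByCluster(records);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The explicit instanceof check is a design choice: if a future version stopped
forwarding the vectors unchanged, the job would fail at this point rather
than silently emit clusters without IDs.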
On Aug 13, 2011 6:30 PM, "Lance Norskog" <[email protected]> wrote:
> "The NVs will flow through the clustering step into the
> clusteredPoints directory." Be careful about this part. It is hard to
> guarantee that this will always work, and will keep working as classes
> evolve.
>
> On Fri, Aug 12, 2011 at 12:35 PM, Eshwaran Vijaya Kumar
> <[email protected]> wrote:
>> Excellent. NamedVectors would do the job. Thanks.
>> On Aug 12, 2011, at 12:09 PM, Jeff Eastman wrote:
>>
>>> KMeans does not use the key in its mapper, only the VectorWritable
>>> value. But you can create NamedVectors in your upstream processing and
>>> put the IDs in the name and the Vectors in the delegate. The NVs will
>>> flow through the clustering step into the clusteredPoints directory.
>>> You will have to write your own clustering step if you want a different
>>> output than the WVWs.
>>>
>>> -----Original Message-----
>>> From: Eshwaran Vijaya Kumar [mailto:[email protected]]
>>> Sent: Friday, August 12, 2011 11:44 AM
>>> To: [email protected]
>>> Subject: Mahout KMeans Output
>>>
>>> I am using KMeans as part of a long pipeline. Suppose I give KMeans a
>>> SequenceFile with IntWritable keys and VectorWritable values, where the
>>> keys are IDs for the Vectors. Is there a utility or an option to get
>>> KMeans to spit out the IDs that belong to a cluster rather than the
>>> WeightedVectorWritable bean?
>>>
>>> Thanks
>>> Esh
>>
>>
>
>
>
> --
> Lance Norskog
> [email protected]
