>>> stopwords inside) - maybe it is a question on its own - how can I easily go
>>> back from clusters->original docs (and not just vectors), I do not know,
>>> maybe some kind of mapper which maps vectors to the original documents
>>> somehow (e.g. sort of URL for a document based on the vector id/index or
>>> something?).
>>>
>>
>> To do this, you should use the document ID and just return the original
>> content from some other content store.  Lucene or especially SOLR can help
>> with this.

> Right, Mahout's vector can take labels.

What do you mean by using the document ID, and by vectors being able to take
labels? Is it something I could use right away with the current cluster
vectors, or should I change some Mahout code to get at the document IDs?
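
To make the question concrete, here is a minimal sketch of how I understand
the suggestion: each vector carries the ID of its source document as a label,
and the original text is fetched by that ID from a separate store. Nothing
here is Mahout or Solr API; the names `documents`, `labeled_vectors`,
`clusters` and `docs_for_cluster` are hypothetical, and a plain dict stands in
for the content store.

```python
# Hypothetical stand-in for an external content store (e.g. a search index),
# keyed by document ID.
documents = {
    "doc-1": "Clustering groups similar text vectors.",
    "doc-2": "Stopwords are usually removed before vectorizing.",
    "doc-3": "Similar vectors end up in the same cluster.",
}

# Each vector is stored together with a label: the ID of the document it was
# built from. The numbers are made-up term weights.
labeled_vectors = [
    ("doc-1", [0.9, 0.1]),
    ("doc-2", [0.1, 0.8]),
    ("doc-3", [0.8, 0.2]),
]

# Assumed clustering output: cluster ID -> indices into labeled_vectors.
clusters = {0: [0, 2], 1: [1]}


def docs_for_cluster(cluster_id, clusters, labeled_vectors, documents):
    """Return the original documents behind the vectors in one cluster."""
    result = []
    for index in clusters[cluster_id]:
        doc_id, _vector = labeled_vectors[index]  # the label is the document ID
        result.append(documents[doc_id])          # look up content by ID
    return result


for text in docs_for_cluster(0, clusters, labeled_vectors, documents):
    print(text)
```

If that is the idea, then the clustering code itself would not need to change,
as long as the label survives from the input vectors to the cluster output.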
