The clustering code has a parameter that controls whether the cluster-point
assignments are generated. If it is set, the job creates a folder called
clusteredPoints in the output directory, containing a sequence file with the
point-to-cluster mappings.
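
Once those mappings have been read, turning them into a list of item ids per
cluster is a simple grouping step. Below is a minimal sketch of that step only.
It assumes each record has already been reduced to a (cluster id, item id)
pair; reading the sequence file itself depends on the Mahout and Hadoop
versions in use and is not shown. The class name ClusterAssignments, the
method groupByCluster, and the cluster ids in the sample are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClusterAssignments {

    // Groups (clusterId, itemId) pairs into clusterId -> list of item ids.
    // A TreeMap keeps the clusters sorted by id so the output order is stable.
    static Map<Integer, List<String>> groupByCluster(
            List<Map.Entry<Integer, String>> assignments) {
        Map<Integer, List<String>> byCluster = new TreeMap<>();
        for (Map.Entry<Integer, String> assignment : assignments) {
            byCluster
                .computeIfAbsent(assignment.getKey(), k -> new ArrayList<>())
                .add(assignment.getValue());
        }
        return byCluster;
    }

    public static void main(String[] args) {
        // Hypothetical assignments: the item ids come from the example CSV,
        // the cluster ids are invented for this sketch.
        List<Map.Entry<Integer, String>> sample = List.of(
            Map.entry(0, "1223"),
            Map.entry(1, "1344"));

        // Print one line per cluster with the item ids assigned to it.
        groupByCluster(sample).forEach(
            (cluster, items) -> System.out.println(cluster + " -> " + items));
    }
}
```

For this to work on real data, the item id has to travel with each vector
through vectorization, so that it is still available next to the cluster id
when the mappings are read back.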

Robin

On Tue, Feb 15, 2011 at 6:02 AM, Kidong Lee <[email protected]> wrote:

> Hi,
>
> My situation is almost the same as '12.1 Finding similar users on Twitter'
> in the Mahout in Action book.
>
> My document contains a list of item ids and their contents, separated by a
> comma delimiter, for example this CSV file (itemId, itemContents):
> 1223, sports
> 1344, football nike
> ...
>
> First I converted this CSV file to a sequence file, and vectorized the
> sequence file with SparseVectorsFromSequenceFiles.
> With k-means clustering, I got the clusters. Up to this point, everything
> is fine.
>
> I want to get the list of items that belong to a cluster, but I have no
> idea how.
> I have printed the entries using the cluster dumper, but there is no info
> about the item id.
>
> Any idea how to get the list of item ids that belong to a cluster?
>
> - Kidong.
>
