Hi, my situation is almost the same as section 12.1, 'Finding similar users on Twitter', in the book Mahout in Action.
I have a document containing a list of item IDs and their contents, separated by a comma delimiter. For example, a CSV file (itemId, itemContents) like this:

1223, sports
1344, football nike
...

First I converted this CSV file to a sequence file and vectorized the sequence file with SparseVectorsFromSequenceFiles. With k-means clustering I got the clusters. Up to this point everything works fine.

I want to get the list of items that belong to a cluster, but I have no idea how. I printed the entries using cluster-dumper, but the output contains no information about the item IDs.

Any idea how to get the list of item IDs that belong to a cluster?

- Kidong
