The clustering code has a parameter that controls whether the cluster-point assignments are generated. If it is set, the job creates a folder called clusteredPoints in the output directory, containing a sequence file with the mappings.
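
Once those mappings have been read out of the sequence file, getting the list of items per cluster is just a grouping step. Here is a minimal sketch in Python, assuming the mappings are already available as (cluster id, item id) pairs. Reading the sequence file itself is not shown, and the function name and the sample pairs are made up for illustration:

```python
from collections import defaultdict


def group_items_by_cluster(assignments):
    """Group item ids under the cluster id they were assigned to."""
    clusters = defaultdict(list)
    for cluster_id, item_id in assignments:
        clusters[cluster_id].append(item_id)
    return dict(clusters)


# Hypothetical (cluster id, item id) pairs standing in for the contents
# of the clusteredPoints sequence file; real values come from your run.
assignments = [(0, "1223"), (1, "1344"), (0, "1501")]

items_by_cluster = group_items_by_cluster(assignments)
for cluster_id in sorted(items_by_cluster):
    print(cluster_id, items_by_cluster[cluster_id])
```

This only works if the item id survives vectorization as the key or name of each vector, so that it shows up in the mappings.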
Robin

On Tue, Feb 15, 2011 at 6:02 AM, Kidong Lee <[email protected]> wrote:

> Hi,
>
> My situation is almost like '12.1 Finding similar users on Twitter' in
> Mahout in action book.
>
> In my document, there are lists of item id and its contents seperated by
> delimiter comma, for example like this CSV file(itemId, itemContents):
> 1223, sports
> 1344, football nike
> ...
>
> First I did convert this csv file to sequence file, and vectorized the
> sequence file with SparseVectorsFromSequenceFiles.
> With kmeans clustering, I got the clusters. Until this, all the things
> fine.
>
> I wanted to get the list of items which belong to a cluster, but I have no
> idea how.
> I have printed the entries using cluster-dumper, but there is no info about
> the item id.
>
> Any idea how to get the list of item id which belong to a cluster?
>
> - Kidong.
>
