Actually, the clustering was done with Mahout 0.5, but I am using the clusterdumper code from the current Mahout "trunk" to analyze the clusters. To make it run, I renamed the final cluster directory by appending "-final". I still got the OOM error even after increasing the Mahout heap size, so I wrote my own code to analyze the clusters by reading "clusteredPoints".
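For anyone following along, here is a minimal sketch of the kind of per-cluster count described above. It is not the program mentioned in this message. It assumes the "clusteredPoints" records have already been exported to plain text in a hypothetical layout of one record per line, "<cluster id><TAB><vector>", which is not a native Mahout format. The function name `count_points_per_cluster` is also made up for illustration.

```python
from collections import Counter


def count_points_per_cluster(lines):
    """Count how many points were assigned to each cluster id.

    Assumption: each non-blank line is "<cluster id><TAB><vector>".
    This layout is hypothetical; adapt the parsing to the real export.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before the first tab is treated as the cluster id.
        cluster_id, _, _vector = line.partition("\t")
        counts[cluster_id] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample records, only to show the call.
    sample = ["21\t[1:0.5, 2:0.1]", "21\t[1:0.4]", "7\t[3:0.9]"]
    for cluster_id, n in count_points_per_cluster(sample).most_common():
        print(cluster_id, n)
```

Because it reads one line at a time and keeps only a counter per cluster id, memory use grows with the number of clusters rather than the number of points.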
On Thu, Dec 15, 2011 at 2:58 AM, Gary Snider <[email protected]> wrote:
> Ok. See if you can get the --pointsDir working and post what you get. Also
> for seqFileDir do you have a directory with the word 'final' in it?
>
> On Dec 14, 2011, at 12:37 PM, ipshita chatterji <[email protected]> wrote:
>
>> For clusterdumper I had the following command line:
>>
>> $MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-6
>> --output clusteranalyze.txt
>>
>> Have written a separate program to read clusteredOutput directory as
>> clusterdumper with "--pointsDir output/clusteredPoints " was giving
>> OOM exception.
>>
>> Thanks
>>
>> On Wed, Dec 14, 2011 at 10:06 PM, Gary Snider <[email protected]>
>> wrote:
>>> What was on your command line? e.g. seqFileDir, pointsDir, etc
>>>
>>> On Wed, Dec 14, 2011 at 10:54 AM, ipshita chatterji
>>> <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am a newbie in Mahout and also have elementary knowledge of
>>>> clustering. I managed to cluster my data using meanshift and then ran
>>>> clusterdumper. I get the following output:
>>>>
>>>> MSV-21{n=1 c=[1:0...........]
>>>>
>>>> So I assume that the cluster above has converged and n=1 indicates
>>>> that there is only one point associated with the cluster above.
>>>>
>>>> Now I try to read the members of this cluster from the "clusteredPoints"
>>>> directory. I see from the output that the number of points belonging to
>>>> this cluster is 173.
>>>>
>>>> Why is this mismatch happening? Am I missing something here?
>>>>
>>>> Thanks,
>>>> Ipshita
>>>>
