Re: Query on clusterdumper output and clusteredPoints

ipshita chatterji Wed, 14 Dec 2011 19:08:19 -0800

Actually, the clustering was done with Mahout 0.5, but I am using the
clusterdumper code from the current version of Mahout in "trunk" to
analyze the clusters. To make it run, I renamed the final cluster
directory by appending "-final".
I got the OOM error even after increasing the Mahout heap size, and
hence wrote a program of my own to analyze the clusters by reading
"clusteredPoints".

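A stand-alone check of this kind can stream the points and tally them per
cluster id rather than hold them all in memory. The following is only a
minimal sketch in Python, not the program mentioned above: it assumes the
clusteredPoints sequence files were first dumped to text, one record per
line, in the shape "Key: <clusterId>: Value: <weight>: <vector>". That line
shape, the function name count_points_per_cluster, and the sample records
are all illustrative assumptions.

```python
import re
from collections import Counter

# Assumed shape of one dumped record (hypothetical, for illustration):
#   Key: <clusterId>: Value: <weight>: <vector>
KEY_PATTERN = re.compile(r"^Key:\s*(\d+):")


def count_points_per_cluster(lines):
    """Tally points per cluster id, reading one line at a time."""
    counts = Counter()
    for line in lines:
        match = KEY_PATTERN.match(line)
        if match:  # skip headers, footers, and anything that is not a record
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample records; a real run would iterate over a file object.
    sample = [
        "Key: 21: Value: 1.0: [1:0.5, 2:0.25]",
        "Key: 21: Value: 1.0: [1:0.4, 2:0.30]",
        "Key: 7: Value: 1.0: [3:1.0]",
    ]
    for cluster_id, n in sorted(count_points_per_cluster(sample).items()):
        print(cluster_id, n)
```

Comparing the tally for a cluster id with the n value that clusterdumper
prints for the same cluster shows directly whether the two outputs disagree.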

On Thu, Dec 15, 2011 at 2:58 AM, Gary Snider <[email protected]> wrote:
> Ok.  See if you can get the --pointsDir working and post what you get.  Also 
> for seqFileDir do you have a directory with the word 'final' in it?
>
> On Dec 14, 2011, at 12:37 PM, ipshita chatterji <[email protected]> wrote:
>
>> For clusterdumper I had the following command line:
>>
>> $MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-6
>> --output clusteranalyze.txt
>>
>> I have written a separate program to read the clusteredPoints directory,
>> as clusterdumper with "--pointsDir output/clusteredPoints" was giving an
>> OOM exception.
>>
>> Thanks
>>
>> On Wed, Dec 14, 2011 at 10:06 PM, Gary Snider <[email protected]> wrote:
>>> What was on your command line?  e.g. seqFileDir, pointsDir, etc
>>>
>>> On Wed, Dec 14, 2011 at 10:54 AM, ipshita chatterji <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am a newbie to Mahout and have only elementary knowledge of
>>>> clustering. I managed to cluster my data using meanshift and then ran
>>>> clusterdumper. I get the following output:
>>>>
>>>> MSV-21{n=1 c=[1:0...........]
>>>>
>>>> So I assume that the cluster above has converged, and that n=1 indicates
>>>> that only one point is associated with it.
>>>>
>>>> Now I try to read the members of this cluster from the "clusteredPoints"
>>>> directory. I see from the output that the number of points belonging to
>>>> this cluster is 173.
>>>>
>>>> Why is this mismatch happening? Am I missing something here?
>>>>
>>>> Thanks,
>>>> Ipshita
>>>>
