Re: Query on clusterdumper output and clusteredPoints

ipshita chatterji Fri, 16 Dec 2011 01:29:21 -0800

Hi,
Thanks for the pointers. Please see my replies inline, marked with >>.

You can use ClusterCountReader to find out the number of clusters in the output.
>> I used this from the 0.6 snapshot, and the number of clusters it reports
>> matches the number of clusters generated by clusterdumper.


I think doing the following will fulfill your requirement:

1) Use 0.6-snapshot all along.
>> Used, but the mismatch persists.
2) Do the clustering (note how you did it: sequentially or the MapReduce way).
>> The MapReduce way.
3) Run ClusterOutputPostProcessorDriver (the same way as in step 2:
sequentially or the MapReduce way) and after that, read the vectors of the
clusters.
>> Same as (2) above.

4) Analyze whether the vectors have been clustered properly according
to your requirement.
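
To narrow down where the counts diverge, one option is to compare the cluster ids from each source rather than only the totals. The following is a minimal sketch, assuming the ids have already been extracted into plain Python collections; `compare_cluster_ids` and both of its inputs are hypothetical names for illustration and are not part of Mahout:

```python
from collections import Counter


def compare_cluster_ids(dumped_ids, point_assignments):
    """Report cluster ids that appear on only one side.

    dumped_ids: iterable of cluster ids (assumed to be parsed from the
        clusterdumper output by some other step).
    point_assignments: iterable of (cluster_id, point_id) pairs (assumed
        to be read from the clusteredPoints output by some other step).
    """
    dumped = set(dumped_ids)
    # Count how many points were assigned to each cluster id.
    sizes = Counter(cluster_id for cluster_id, _ in point_assignments)
    populated = set(sizes)
    return {
        "only_in_dump": sorted(dumped - populated),
        "only_in_points": sorted(populated - dumped),
        "sizes": dict(sizes),
    }


if __name__ == "__main__":
    # Made-up ids, purely for illustration.
    report = compare_cluster_ids([1, 2, 3], [(1, "a"), (1, "b"), (3, "c")])
    print(report)
```

Listing the ids that show up on only one side makes it easier to tell whether the difference comes from clusters without members or from ids that do not line up between the two outputs.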

Have I missed anything now?

Thanks,
Ipshita
On Fri, Dec 16, 2011 at 11:00 AM, Paritosh Ranjan <[email protected]> wrote:
> "I still get a mismatch between the number of clusters generated by
> clusterdumper and after reading the members."
>
> You can use ClusterCountReader to find out the number of clusters in the
> output.
>
> I think doing the following will fulfill your requirement:
>
> 1) Use 0.6-snapshot all along.
> 2) Do the clustering (note how you did it: sequentially or the MapReduce way).
> 3) Run ClusterOutputPostProcessorDriver (the same way as in step 2:
> sequentially or the MapReduce way) and after that, read the vectors of the clusters.
> 4) Analyze whether the vectors have been clustered properly according to
> your requirement.
>
>
>
>
> On 15-12-2011 20:01, ipshita chatterji wrote:
>>
>> I still get a mismatch
>> between the number of clusters generated by clusterdumper and after
>> reading the members.
>
>
