Hi,

Thanks for the pointers. Please see my replies inline, marked with >>.

You can use ClusterCountReader to find out the number of clusters in the output.
>> I have used this from 0.6 snapshot and the number of clusters matches the
>> number of clusters generated by clusterdumper.

I think doing following things will fulfill your requirement:

1) Use 0.6-snapshot all along.
>> Used but the mismatch persists
2) Do clustering ( Note how you did it : sequentially or mapreduce way )
>> mapreduce way
3) Run ClusterOutputPostProcessorDriver ( the same way as in step 2 : sequentially or mapreduce way ) and after that, read vectors of the clusters
>> same as (2) above
4) Analyze whether the vectors have been clustered properly according to your requirement.

Have I missed anything now?

Thanks,
Ipshita

On Fri, Dec 16, 2011 at 11:00 AM, Paritosh Ranjan <[email protected]> wrote:
> /"I still get a mismatch between the number of clusters generated by
> clusterdumper and after reading the members. "/
>
> You can use ClusterCountReader to find out the number of clusters in the
> output.
>
> I think doing following things will fulfill your requirement:
>
> 1) Use 0.6-snapshot all along.
> 2) Do clustering ( Note how you did it : sequentially or mapreduce way )
> 3) Run ClusterOutputPostProcessorDriver ( the same way as in step 2 :
> sequentially or mapreduce way ) and after that, read vectors of the clusters
> 4) Analyze whether the vectors have been clustered properly according to
> your requirement.
>
> On 15-12-2011 20:01, ipshita chatterji wrote:
>>
>> I still get a mismatch
>> between the number of clusters generated by clusterdumper and after
>> reading the members.
