[ 
https://issues.apache.org/jira/browse/MAHOUT-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399375#comment-13399375
 ] 

Gaurav Redkar commented on MAHOUT-966:
--------------------------------------

Yes, I can try to look into this issue. I would like a clarification regarding 
the difference between the variables "numPoints" and "boundPoints", as mentioned 
in my previous comment above. 

One thing to note: the size of "boundPoints" (a list of the points belonging to 
a cluster), which I printed by tweaking the clusterdumper code, actually matched 
the number of points printed in each cluster. So could it be that "numPoints" 
was not properly calculated at the end of the last iteration, before the 
algorithm terminates? It is just a guess. I will try to look deeper into it.
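
To make the comparison concrete, here is a minimal sketch (not Mahout code) of 
the kind of check described above: it reads the "n=" value from cluster summary 
lines like the two quoted in the issue description and compares it with a 
separately obtained count of points per cluster. The function names and the 
"actual" counts are assumptions made up for illustration; real counts would 
have to come from the points actually written for each cluster.

```python
import re

# Matches the start of a cluster summary line such as "MSV-287{ n=90 c=[...".
CLUSTER_LINE = re.compile(r"^\s*(?P<id>[A-Za-z]+-\d+)\{\s*n=(?P<n>\d+)\b")


def parse_reported_counts(lines):
    """Return {cluster_id: n} for every line that looks like a cluster summary."""
    counts = {}
    for line in lines:
        match = CLUSTER_LINE.match(line)
        if match:
            counts[match.group("id")] = int(match.group("n"))
    return counts


def find_mismatches(reported, actual):
    """Return {cluster_id: (reported_n, actual_n)} where the two counts differ."""
    return {
        cluster_id: (n, actual.get(cluster_id, 0))
        for cluster_id, n in reported.items()
        if actual.get(cluster_id, 0) != n
    }


# The two summary lines quoted in the issue description.
dump_lines = [
    "MSV-287{ n=90 c=[0.05195, 0.05675, 0.07151, 0.05713, 0.06946,...}",
    "MSV-145{ n=90 c=[0.93685, 0.93071, 0.93641, 0.94629, 0.94409,..}",
]
reported = parse_reported_counts(dump_lines)

# HYPOTHETICAL counts, invented for this sketch only. They are not taken from
# the attachments of this issue.
actual = {"MSV-287": 90, "MSV-145": 85}

mismatches = find_mismatches(reported, actual)
for cluster_id, (reported_n, actual_n) in sorted(mismatches.items()):
    print(f"{cluster_id}: reported n={reported_n}, counted {actual_n}")
```

A cluster that shows up in the mismatch map is one where the stored count and 
the counted points disagree, which is the symptom reported here.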
                
> Mismatch in the number of points given by the clusterDumper and 
> ClusterOutputPostProcessor
> ------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-966
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-966
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.6
>         Environment: hadoop 0.20.2 mahout 0.6 
>            Reporter: Gaurav Redkar
>            Priority: Minor
>         Attachments: cluster-dumper-output.txt, clusterpp-output.txt, 
> mtestdata.txt, points100dCCNorm.txt
>
>
>  After running the post processor, the number of points that each cluster 
> contains does not match the number of points each cluster should contain, as 
> stated by clusterdumper.
>  
> MSV-287{ n=90 c=[0.05195, 0.05675, 0.07151, 0.05713, 0.06946,...}
> MSV-145{ n=90 c=[0.93685, 0.93071, 0.93641, 0.94629, 0.94409,..}
> The n reported in clusters-n-final for each cluster differs from the number 
> of points actually contained in d directory for each cluster. Any idea why 
> this is happening?


        
