[ 
https://issues.apache.org/jira/browse/MAHOUT-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659829#comment-13659829
 ] 

Suneel Marthi edited comment on MAHOUT-1217 at 5/16/13 7:02 PM:
----------------------------------------------------------------

a) This issue is not related to the Distance Measure that's being used.
b) It happens with both FastProjection Search and Projection Search. I suspect 
it has to do with the Array sort that's happening in FastProjection Search. 
Here's the exception:

{Code}

java.lang.RuntimeException: Unable to remove centroid
        at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.clusterInternal(StreamingKMeans.java:310)
        at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.cluster(StreamingKMeans.java:212)
        at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.cluster(StreamingKMeans.java:221)
        at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansMapper.map(StreamingKMeansMapper.java:56)
        at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansMapper.map(StreamingKMeansMapper.java:31)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)

{Code}
                
      was (Author: smarthi):
a) This issue is not related to the Distance Measure that's being used.
b) It only happens in FastProjection Search; it works fine with Projection Search. 
I suspect it has to do with the Array sort that's happening in FastProjection 
Search.
                  
> Nearest neighbor searchers sometimes fail to remove points
> ----------------------------------------------------------
>
>                 Key: MAHOUT-1217
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1217
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.8
>            Reporter: Dan Filimon
>
> When updating a Centroid in StreamingKMeans, the Centroid needs to be removed 
> and its updated version added.
> When removing a point that is already in a searcher, the searcher sometimes 
> fails to return the closest point (the one being searched for), causing a 
> RuntimeException.
> This has been observed for TF-IDF vectors with SquaredEuclideanDistance and 
> CosineDistance, using FastProjectionSearch.
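
The remove-then-add update described in the issue can be sketched roughly as 
follows. This is a toy illustration, not Mahout's code: RemoveSketch, 
ToySearcher, its window parameter, and EPSILON are all hypothetical names, and 
the x-coordinate merely stands in for a scalar projection. It only shows how an 
approximate searcher that examines too few candidates can miss a point it 
actually stores, so the caller's removal fails. It makes no claim about the 
real cause of MAHOUT-1217.

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveSketch {
    // Hypothetical tolerance for "this is the same point".
    static final double EPSILON = 1e-9;

    /** Toy approximate searcher: points are kept sorted by x (the stand-in projection). */
    static final class ToySearcher {
        private final List<double[]> points = new ArrayList<>();
        private final int window; // candidates examined on each side of the landing position

        ToySearcher(int window) {
            this.window = window;
        }

        void add(double[] p) {
            points.add(p);
            // Stable sort by projection; ties keep insertion order.
            points.sort((a, b) -> Double.compare(a[0], b[0]));
        }

        int size() {
            return points.size();
        }

        /** Returns the closest point among the examined candidates, or null if empty. */
        double[] searchFirst(double[] query) {
            if (points.isEmpty()) {
                return null;
            }
            // Land on the first stored point whose projection is >= the query's.
            int pos = 0;
            while (pos < points.size() - 1 && points.get(pos)[0] < query[0]) {
                pos++;
            }
            int lo = Math.max(0, pos - window);
            int hi = Math.min(points.size() - 1, pos + window);
            double[] best = null;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int i = lo; i <= hi; i++) {
                double d = squaredDistance(points.get(i), query);
                if (d < bestDist) {
                    bestDist = d;
                    best = points.get(i);
                }
            }
            return best;
        }

        /** Removes p only if the search actually returns a point within epsilon of it. */
        boolean remove(double[] p, double epsilon) {
            double[] closest = searchFirst(p);
            if (closest == null || squaredDistance(closest, p) > epsilon) {
                return false; // the search missed the stored point
            }
            return points.remove(closest); // arrays compare by identity here
        }
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** The update pattern from the issue: remove the old centroid, then add the new one. */
    static void updateCentroid(ToySearcher searcher, double[] old, double[] updated) {
        if (!searcher.remove(old, EPSILON)) {
            throw new RuntimeException("Unable to remove centroid");
        }
        searcher.add(updated);
    }

    public static void main(String[] args) {
        for (int window : new int[] {0, 1}) {
            ToySearcher searcher = new ToySearcher(window);
            searcher.add(new double[] {0, 0});
            double[] centroid = {0, 5}; // same projection as (0, 0)
            searcher.add(centroid);
            try {
                updateCentroid(searcher, centroid, new double[] {0, 6});
                System.out.println("window=" + window + ": updated");
            } catch (RuntimeException e) {
                System.out.println("window=" + window + ": " + e.getMessage());
            }
        }
    }
}
```

In this toy setup the two stored points share a projection, so a window of 0 
examines only the other point and the removal fails, while a window of 1 
examines both and the update goes through.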

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
