[ https://issues.apache.org/jira/browse/MAHOUT-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659829#comment-13659829 ]
Suneel Marthi edited comment on MAHOUT-1217 at 5/16/13 7:02 PM:
----------------------------------------------------------------
a) This issue is not related to the distance measure being used.
b) It happens with both FastProjectionSearch and ProjectionSearch. I suspect
it has to do with the array sort that happens in FastProjectionSearch.
Here's the exception:
{Code}
java.lang.RuntimeException: Unable to remove centroid
    at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.clusterInternal(StreamingKMeans.java:310)
    at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.cluster(StreamingKMeans.java:212)
    at org.apache.mahout.clustering.streaming.cluster.StreamingKMeans.cluster(StreamingKMeans.java:221)
    at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansMapper.map(StreamingKMeansMapper.java:56)
    at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansMapper.map(StreamingKMeansMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)
{Code}
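To make the failure mode concrete, here is a minimal, self-contained Java sketch. It is not Mahout's code: ToySearcher, searchSize, update and build are hypothetical names, and the 1-D projection with a capped candidate window is only an assumed stand-in for an approximate searcher. It shows how a remove that depends on an approximate nearest-neighbor lookup can miss a point that is actually stored, which would then surface as the exception above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RemoveSketch {
    /** Hypothetical toy searcher; NOT Mahout's FastProjectionSearch. */
    static class ToySearcher {
        private final List<double[]> points = new ArrayList<>();
        private final int searchSize; // assumed cap on candidates inspected per query

        ToySearcher(int searchSize) { this.searchSize = searchSize; }

        void add(double[] p) { points.add(p); }

        int size() { return points.size(); }

        static double squaredDistance(double[] a, double[] b) {
            double sum = 0;
            for (int i = 0; i < a.length; i++) {
                double d = a[i] - b[i];
                sum += d * d;
            }
            return sum;
        }

        /** Ranks points by a 1-D projection (coordinate 0), then checks only searchSize candidates. */
        double[] searchFirst(double[] query) {
            List<double[]> byProjection = new ArrayList<>(points);
            // List.sort is stable, so points with tied projections keep insertion order.
            byProjection.sort(Comparator.comparingDouble((double[] p) -> Math.abs(p[0] - query[0])));
            double[] best = null;
            int limit = Math.min(searchSize, byProjection.size());
            for (double[] p : byProjection.subList(0, limit)) {
                if (best == null || squaredDistance(p, query) < squaredDistance(best, query)) {
                    best = p;
                }
            }
            return best;
        }

        /** Removes target only if the (approximate) search returns a point within epsilon of it. */
        boolean remove(double[] target, double epsilon) {
            double[] closest = searchFirst(target);
            if (closest != null && squaredDistance(closest, target) < epsilon) {
                return points.remove(closest);
            }
            return false;
        }
    }

    /** Remove-then-add update, mirroring the pattern described in the issue. */
    static void update(ToySearcher searcher, double[] oldCentroid, double[] newCentroid) {
        if (!searcher.remove(oldCentroid, 1e-7)) {
            throw new RuntimeException("Unable to remove centroid");
        }
        searcher.add(newCentroid);
    }

    /** Four points that all project to the same value on coordinate 0. */
    static ToySearcher build(int searchSize) {
        ToySearcher s = new ToySearcher(searchSize);
        for (int y = 1; y <= 4; y++) {
            s.add(new double[] {0, y});
        }
        return s;
    }

    public static void main(String[] args) {
        // Exhaustive window: the stored point is found and replaced.
        update(build(4), new double[] {0, 4}, new double[] {0, 5});
        System.out.println("exhaustive search: update succeeded");
        try {
            // Narrow window: tied projections hide the stored point, so the remove fails.
            update(build(2), new double[] {0, 4}, new double[] {0, 5});
        } catch (RuntimeException e) {
            System.out.println("narrow search: " + e.getMessage());
        }
    }
}
```

When searchSize covers every stored point the lookup is exhaustive and the update goes through; with a smaller window the tied projections hide the target and the remove fails. Whether this is what actually happens inside FastProjectionSearch or ProjectionSearch is unverified.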
was (Author: smarthi):
a) This issue is not related to the distance measure being used.
b) It only happens with FastProjectionSearch; it works fine with
ProjectionSearch. I suspect it has to do with the array sort that happens in
FastProjectionSearch.
> Nearest neighbor searchers sometimes fail to remove points
> ----------------------------------------------------------
>
> Key: MAHOUT-1217
> URL: https://issues.apache.org/jira/browse/MAHOUT-1217
> Project: Mahout
> Issue Type: Bug
> Components: Math
> Affects Versions: 0.8
> Reporter: Dan Filimon
>
> When updating a Centroid in StreamingKMeans, the Centroid needs to be removed
> and its updated version added.
> When removing a point that is already in a searcher, the searcher sometimes
> fails to return the closest point (the one being searched for), causing a
> RuntimeException.
> This has been observed for TF-IDF vectors with SquaredEuclideanDistance and
> CosineDistance, using FastProjectionSearch.