[ https://issues.apache.org/jira/browse/MAHOUT-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037504#comment-13037504 ]
Ted Dunning commented on MAHOUT-668:
------------------------------------

{quote}
You're right. The distance metrics will have trouble with Random Vectors. I'll work on a fix for that. (The code is on the critical path, I can't afford to lose the speed of the current method and the other vector methods give incorrect results for missing=0 vectors)
{quote}

Sparse vectors in Mahout assume that missing elements are 0. Are you saying that you want to consider missing elements as something other than 0? Your javadoc didn't seem to say that.

You should get the same results either way.

> Adding knn support to Mahout classifiers
> ----------------------------------------
>
>                 Key: MAHOUT-668
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-668
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: Daniel McEnnis
>              Labels: classification, knn
>         Attachments: MAHOUT-668.pat, Mahout-668-2.patch, Mahout-668-3.patch, Mahout-668.pat
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Initial implementation of the knn. This is a minimum base set with many more
> possible add-ons including support for text and weka input as well as a
> classify only (no confusion matrix) back end. The system was tested on the
> 20 newsgroup data set.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
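
To illustrate the point in the comment that a sparse vector with missing elements read as 0 should give the same distance as its dense equivalent, here is a minimal sketch. It does not use Mahout's Vector or DistanceMeasure classes; plain Java maps stand in for sparse vectors, and the names SparseDistanceSketch, sparseSquaredDistance, denseSquaredDistance, and toDense are hypothetical, chosen only for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Mahout or of the attached patches.
class SparseDistanceSketch {

    // Squared Euclidean distance over index -> value maps.
    // An index absent from a map is read as 0.0.
    static double sparseSquaredDistance(Map<Integer, Double> a, Map<Integer, Double> b) {
        Set<Integer> indices = new HashSet<>(a.keySet());
        indices.addAll(b.keySet());
        double sum = 0.0;
        for (int i : indices) {
            double diff = a.getOrDefault(i, 0.0) - b.getOrDefault(i, 0.0);
            sum += diff * diff;
        }
        return sum;
    }

    // Squared Euclidean distance over fully materialized arrays of equal length.
    static double denseSquaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Expand a sparse map into a dense array, filling missing entries with 0.0.
    static double[] toDense(Map<Integer, Double> sparse, int cardinality) {
        double[] dense = new double[cardinality];
        for (Map.Entry<Integer, Double> e : sparse.entrySet()) {
            dense[e.getKey()] = e.getValue();
        }
        return dense;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 1.0);
        a.put(3, 2.0);
        Map<Integer, Double> b = new HashMap<>();
        b.put(3, 5.0);
        b.put(4, 4.0);

        int cardinality = 6; // assumed vector length for the dense form
        double sparse = sparseSquaredDistance(a, b);
        double dense = denseSquaredDistance(toDense(a, cardinality), toDense(b, cardinality));
        System.out.println("sparse=" + sparse + " dense=" + dense);
    }
}
```

Indices where both vectors are missing contribute nothing to the sum, which is why the sparse version can skip them. That shortcut is only valid while missing means 0; any other default would make the two results differ.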