Brian Gawalt created SPARK-2336:
-----------------------------------
Summary: Approximate k-NN Models for MLlib
Key: SPARK-2336
URL: https://issues.apache.org/jira/browse/SPARK-2336
Project: Spark
Issue Type: New Feature
Components: MLlib
Reporter: Brian Gawalt
Priority: Minor
After tackling the general k-Nearest Neighbor model as per
https://issues.apache.org/jira/browse/SPARK-2335 , there's an opportunity to
also offer approximate k-Nearest Neighbor. A promising approach would involve
building a kd-tree variant within each partition, a la
http://www.autonlab.org/autonweb/14714.html?branch=1&language=2
This could offer a simple non-linear ML model that can label new data with much
lower latency than the plain-vanilla kNN versions.
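
To make the idea concrete, here is a minimal sketch in plain Python, not Spark
code and not a proposed MLlib API. It assumes each partition is simply a list of
(vector, label) pairs, builds one kd-tree per partition, and answers a query
with a "defeatist" search that descends to a single leaf without backtracking,
which is where the approximation comes from. The names build_kdtree,
approx_query, predict, and the leaf_size parameter are hypothetical. In Spark,
the per-partition build step would presumably map onto something like
RDD.mapPartitions, but that wiring is an assumption and is not shown.

```python
import heapq
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0, leaf_size=2):
    """Build a kd-tree over one partition's (vector, label) pairs.

    Assumes leaf_size >= 1. Leaves hold small buckets of points;
    internal nodes split on the median along a cycling axis.
    """
    if len(points) <= leaf_size:
        return {"leaf": points}
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "axis": axis,
        "split": points[mid][0][axis],
        "left": build_kdtree(points[:mid], depth + 1, leaf_size),
        "right": build_kdtree(points[mid:], depth + 1, leaf_size),
    }


def approx_query(tree, q, k):
    """Defeatist search: walk to one leaf and rank only that bucket.

    Returns up to k (squared distance, label) pairs. Neighbors that sit
    just across a split are missed, so the result is approximate.
    """
    node = tree
    while "leaf" not in node:
        side = "left" if q[node["axis"]] < node["split"] else "right"
        node = node[side]
    scored = ((sq_dist(vec, q), label) for vec, label in node["leaf"])
    return heapq.nsmallest(k, scored, key=lambda t: t[0])


def predict(trees, q, k):
    """Merge per-partition candidates, keep the k closest, majority vote."""
    candidates = []
    for tree in trees:
        candidates.extend(approx_query(tree, q, k))
    top = heapq.nsmallest(k, candidates, key=lambda t: t[0])
    return Counter(label for _, label in top).most_common(1)[0][0]


# Toy data: two "partitions", each holding points from two clusters.
partitions = [
    [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((10.0, 10.0), "b"),
     ((9.8, 10.1), "b"), ((0.3, 0.1), "a"), ((10.2, 9.9), "b")],
    [((0.2, 0.0), "a"), ((9.9, 9.7), "b"), ((0.0, 0.3), "a"),
     ((10.1, 10.3), "b"), ((0.1, 0.1), "a"), ((9.7, 10.0), "b")],
]
trees = [build_kdtree(p) for p in partitions]
print(predict(trees, (0.2, 0.2), k=3))
```

Each query touches one leaf per partition rather than every training point,
which is the source of the latency savings; the price is that a true nearest
neighbor lying on the other side of a split can be overlooked.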