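For large datasets, you need hashing, for example locality-sensitive hashing (LSH), so that likely neighbors land in the same bucket and k-nearest neighbors can be computed locally. You can start by searching Google Scholar for LSH + k-nearest: http://scholar.google.com/scholar?q=lsh+k+nearest

-Xiangrui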
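
To make the hashing idea concrete, here is a minimal single-machine sketch in plain Python. It is not Spark code and not taken from any particular paper. It assumes random-hyperplane LSH (which targets cosine similarity) as one possible hash family, and the names LshIndex, num_bits, num_tables and seed are made up for illustration. Points are hashed into buckets, and a query is compared exactly only against points that share a bucket with it in at least one table. In a distributed setting, the bucket signature could plausibly serve as the grouping key so that each group is searched locally.

```python
import heapq
import math
import random
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def make_hyperplanes(dim, num_bits, rng):
    """Draw num_bits random hyperplane normals with Gaussian components."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


class LshIndex:
    """Hypothetical approximate k-NN index built from several hash tables."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        # Each table has its own hyperplanes and its own bucket map.
        self.tables = [
            (make_hyperplanes(dim, num_bits, rng), defaultdict(list))
            for _ in range(num_tables)
        ]
        self.points = []

    def add(self, vec):
        """Store a vector and register its index in one bucket per table."""
        idx = len(self.points)
        self.points.append(vec)
        for planes, buckets in self.tables:
            buckets[signature(vec, planes)].append(idx)
        return idx

    def query(self, vec, k):
        """Return up to k (distance, index) pairs among bucket candidates."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(signature(vec, planes), ()))
        # Exact ranking is done only over the (usually small) candidate set.
        return heapq.nsmallest(
            k, ((cosine_distance(vec, self.points[i]), i) for i in candidates)
        )
```

The result is approximate: a true neighbor that never shares a bucket with the query is missed. More tables raise recall at the cost of memory, and more bits per signature make buckets smaller and more selective.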

On Tue, Jan 20, 2015 at 9:55 PM, DEVAN M.S. <msdeva...@gmail.com> wrote:
> Hi all,
>
> Please help me to find out best way for K-nearest neighbor using spark for
> large data sets.