The implementation is closely tied to Jaccard similarity. It should be possible to swap out the hash functions for a family that is compatible with other distance measures, such as cosine.
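
To make "swapping the hash family" concrete, here is a rough standalone sketch in plain Python. It is not code from spark-hash and does not use its API; the function names (`minhash_signature`, `hyperplane_signature`, `agreement`) are made up for this example. It assumes set elements are non-negative integers smaller than the prime and that sets are non-empty. MinHash gives signatures whose per-position agreement estimates Jaccard similarity, while random hyperplanes give bit signatures whose agreement tracks the angle between two vectors, which is what you would want for cosine.

```python
import random


def minhash_signature(items, num_hashes=64, seed=0):
    """MinHash family: signature positions agree at a rate that approximates Jaccard similarity."""
    rng = random.Random(seed)
    prime = 2_147_483_647  # Mersenne prime 2^31 - 1
    # One (a, b) pair per hash function h(x) = (a * x + b) mod prime.
    coeffs = [(rng.randrange(1, prime), rng.randrange(0, prime))
              for _ in range(num_hashes)]
    # Keep the smallest hash value seen under each function.
    return [min((a * x + b) % prime for x in items) for a, b in coeffs]


def hyperplane_signature(vector, num_hashes=64, seed=0):
    """Random-hyperplane family: each bit agrees with probability 1 - angle / pi."""
    rng = random.Random(seed)
    # One Gaussian normal vector per hyperplane; same seed and dimension give the same planes.
    planes = [[rng.gauss(0.0, 1.0) for _ in vector] for _ in range(num_hashes)]
    # Each bit records which side of the hyperplane the vector falls on.
    return [1 if sum(p * v for p, v in zip(plane, vector)) >= 0 else 0
            for plane in planes]


def agreement(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures match."""
    return sum(1 for x, y in zip(sig_a, sig_b) if x == y) / len(sig_a)


if __name__ == "__main__":
    a, b = set(range(0, 100)), set(range(50, 150))
    print("minhash agreement:", agreement(minhash_signature(a), minhash_signature(b)))

    u, v = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
    print("hyperplane agreement:", agreement(hyperplane_signature(u), hyperplane_signature(v)))
```

The banding and bucketing step that sits on top of the signatures would stay the same in either case; only the signature function changes. Both signatures must be built with the same seed for the comparison to mean anything.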

> On Dec 22, 2014, at 1:16 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
> Looks interesting, thanks for sharing.
>
> Does it support cosine similarity? I only saw Jaccard mentioned from a quick glance.
>
> On Mon, Dec 22, 2014 at 4:12 AM, morr0723 <michael.d....@gmail.com> wrote:
>
> > I've pushed out an implementation of locality sensitive hashing for Spark.
> > LSH has a number of use cases, the most prominent being when the features are not
> > based in Euclidean space.
> >
> > Code, documentation, and a small exemplar dataset are available on GitHub:
> >
> > https://github.com/mrsqueeze/spark-hash
> >
> > Feel free to pass along any comments or issues.
> >
> > Enjoy!