Re: How about a LSH recommender ?

ke xie Wed, 13 Apr 2011 00:44:49 -0700

OK, I will try to implement a non-distributed one first. I actually have a
Python version already.

But I have a question. Min-hash assumes every entry of the matrix is either 1
or 0 before the hash functions are applied. What about rating data? If the
matrix is filled with ratings from 1 to 5, should we pick some threshold and
convert a rating to 1 if it is above the threshold (and to 0 otherwise)?
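
To make the question concrete, here is a minimal sketch of the thresholding
idea, not a definitive implementation. Everything in it is an assumption made
for illustration: the cutoff of 3, the toy ratings, the function names, and the
choice of (a*x + b) mod p hash functions with integer item ids.

```python
import random

PRIME = 2147483647  # 2**31 - 1; assumed to be larger than any item id


def binarize(ratings, threshold=3):
    """Map each user to the set of items rated above the (assumed) threshold."""
    return {user: {item for item, r in items.items() if r > threshold}
            for user, items in ratings.items()}


def make_hash_funcs(n, seed=42):
    """Build n hash functions of the form (a*x + b) mod PRIME, with a != 0."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(n)]
    return [lambda x, a=a, b=b: (a * x + b) % PRIME for a, b in params]


def minhash_signature(item_ids, hash_funcs):
    """One minimum per hash function; item_ids must be non-empty."""
    return [min(h(i) for i in item_ids) for h in hash_funcs]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two users agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


# Toy data: user -> {item id: rating on a 1..5 scale}
ratings = {
    "u1": {1: 5, 2: 4, 3: 1},
    "u2": {1: 4, 2: 5, 3: 2},
    "u3": {3: 5, 4: 4},
}

liked = binarize(ratings)
funcs = make_hash_funcs(100)
# Users with no item above the threshold are skipped (no signature for them).
sigs = {u: minhash_signature(s, funcs) for u, s in liked.items() if s}

print(estimated_jaccard(sigs["u1"], sigs["u2"]))
print(estimated_jaccard(sigs["u1"], sigs["u3"]))
```

With this conversion, u1 and u2 end up with the same set of liked items, while
u3 shares none with them. The obvious cost is that a 4 and a 5 become
indistinguishable, and a rating at or below the cutoff looks the same as no
rating at all.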

This is the reference I read about LSH. Check it out (chapter 3):
http://infolab.stanford.edu/~ullman/mmds.html

On Wed, Apr 13, 2011 at 3:25 PM, Ted Dunning <[email protected]> wrote:

> Sure.
>
> LSH is a fine candidate for parallelism and scaling.
>
> I would recommend starting small and testing as you go rather than leaping
> into a parallelized full-fledged implementation.  Look for other open-source
> implementations of LSH algorithms.
>
> Be warned that the parameter selection for LSH can be pretty tricky (so I
> hear, anyway).  You should pick a reasonable and realistic test problem so
> that you can experiment with that.
>
>
> On Wed, Apr 13, 2011 at 12:19 AM, ke xie <[email protected]> wrote:
>
>> Can we implement one and contribute into the mahout project? Any
>> suggestions?
>>
>
>

-- 
Name: Ke Xie   Eddy
Research Group of Information Retrieval
State Key Laboratory of Intelligent Technology and Systems
Tsinghua University
