Re: [gsoc] Collaborative filtering algorithms

QIU, Yin Wed, 04 Mar 2009 23:27:40 -0800

Hi!

> Yes there is a framework in the code for running a Recommender across
> machines in Hadoop, and a Hadoop job which distributes part of the
> processing for a slope one recommender.


I don't know the slope one recommender yet. Maybe I should read up on
it first to see how you manage to divide the tasks. However, a brief
explanation in advance would be appreciated.

> Both could use testing, refinement and enhancement.

Could you be more specific about what "refinement and enhancement"
would involve?

> I do not know of an algorithm which is by nature efficiently distributable.
> Finding and implementing such a thing would be great.

Actually, I don't know of one either. But I have two naive ideas.

First, since Hofmann introduced pLSA into CF [3], and I heard that SVD
on MapReduce has been tackled (is that true?), would it be possible to
port his algorithm to Mahout?

Second, I know that Canny proposed an algorithm [1] that runs on
different nodes, theoretically without a central database, though it
was designed for the sake of privacy. Wang et al. also suggested CF
for P2P systems [2]. But I don't know whether either is helpful for
defining Hadoop jobs.

> I would be the person to contact about this so feel free to run your
> proposals by me.

I get it. And I won't let this discussion go off the list. :-)


[1] J. Canny. Collaborative Filtering with Privacy. In Proceedings of
IEEE Symposium on Security and Privacy, 2002.
[2] J. Wang, J. Pouwelse, R. L. Lagendijk and M. J. T. Reinders.
Distributed collaborative filtering for peer-to-peer file sharing
systems. In SAC '06: Proceedings of the 2006 ACM symposium on Applied
computing, pp. 1026-1030. 2006.
[3] T. Hofmann. Latent semantic models for collaborative filtering.
ACM Transactions on Information Systems, volume 22, pp. 89-115. 2004.

-- 
Yin Qiu
