Simple co-occurrence counting is at the heart of most large-scale
recommendation systems.  Counting plus simple (but sound) statistical
filtering suffices for a broad range of recommendation tasks with very high
quality results.  For statistical filtering, I typically recommend the G^2
statistic as a heuristic score (see my blog post on surprise and
coincidence <http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html>
for details).  These are completely scalable algorithms, but Mahout doesn't
have an implementation.
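
To make that concrete, here is a minimal sketch in Python of counting plus
G^2 (log-likelihood ratio) filtering.  It is an illustration only, not Mahout
code: the names g2 and cooccurrence_scores are made up for this example, and
everything runs in memory.  The assumption is that each user history is a
list of item ids; in a distributed setting the counting step is the part you
would spread across machines.

```python
from collections import Counter
from itertools import combinations
from math import log


def g2(k11, k12, k21, k22):
    """G^2 (log-likelihood ratio) statistic for a 2x2 contingency table.

    k11: histories containing both items, k12: only the first item,
    k21: only the second item, k22: neither item.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))
    total = 0.0
    for k, i, j in cells:
        if k > 0:  # treat 0 * log(0) as 0
            total += k * log(k * n / (rows[i] * cols[j]))
    return 2.0 * total


def cooccurrence_scores(histories):
    """Count item pairs that co-occur in a history and score each with G^2.

    Hypothetical helper: histories is a list of lists of item ids.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for items in histories:
        unique = sorted(set(items))  # count each item once per history
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    n = len(histories)
    scores = {}
    for (a, b), k11 in pair_counts.items():
        k12 = item_counts[a] - k11  # a without b
        k21 = item_counts[b] - k11  # b without a
        k22 = n - k11 - k12 - k21   # neither
        scores[(a, b)] = g2(k11, k12, k21, k22)
    return scores
```

Pairs with a high score co-occur more (or less) often than independence
would predict, so in practice you would also check that the pair is
over-represented before keeping it, and then keep only the top-scoring
pairs per item.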

On Wed, Mar 4, 2009 at 12:55 AM, Sean Owen <[email protected]> wrote:

> I do not know of an algorithm which is by nature efficiently distributable.
> Finding and implementing such a thing would be great.
>



-- 
Ted Dunning, CTO
DeepDyve
