Mahout user mailing list

On Tue, Jul 26, 2011 at 12:38 PM, Srinivas Kasturi <[email protected]> wrote:

> ... I came across your blog entry on surprise and coincidence, and wondered
> if you can help me navigate what seems to be a confusing world of
> recommendation algorithms. The problem statement is this:
>
> 1. I have information at a user level in the form of a tag cloud: Words
> they have used and liked, along with a count of the frequency of incidence.
>

Excellent. This is a user x word matrix.

> 2. I would like to use this information to run through a set of around 20
> million product pages, and suggest to them the top 100 that they are most
> likely to enjoy.
>

There are several ways to do this. One simple way is to use a binary recommender to recommend words to the user and then submit the resulting (long-ish) query to a search engine. You might pick a related subset of the recommended words as the query in order to get a shorter and more focused query.

> This, in some way, is the surprise and coincidence problem, isn't it?
>

Yes. It is!

> I am hoping to use one of the Mahout algorithms (
> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms), but can't,
> for the life of me, figure out which one is the closest fit.
>

Firstly there are programs that look at something like your user x term data to find terms that occur anomalously often. Secondly, there are recommendation systems that would let you recommend additional words to the user. I am sure that others will have other suggestions as well.
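
To make the "binary recommender, then search query" idea concrete, here is a minimal sketch in plain Python rather than Mahout. It assumes the tag clouds fit in memory, drops the counts to get a binary user x word matrix, scores word pairs with the log-likelihood ratio (G²) test on their 2x2 co-occurrence table, and joins the recommended words into a query string. The toy data and the names `llr`, `recommend_words` and `tag_clouds` are made up for illustration, and the search engine call itself is left out.

```python
from collections import defaultdict
from math import log


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return x * log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # G^2 statistic for a 2x2 contingency table:
    #   k11 = users with both words, k12 = first word only,
    #   k21 = second word only,      k22 = neither word.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))  # clamp rounding noise


def recommend_words(user_words, user, top_n=5):
    # user_words: dict of user -> set of words (the binary matrix).
    n_users = len(user_words)
    word_users = defaultdict(set)
    for u, words in user_words.items():
        for w in words:
            word_users[w].add(u)

    scores = defaultdict(float)
    for seen in user_words[user]:
        for cand, cand_users in word_users.items():
            if cand in user_words[user]:
                continue  # only recommend words the user has not used
            k11 = len(word_users[seen] & cand_users)
            k12 = len(word_users[seen]) - k11
            k21 = len(cand_users) - k11
            k22 = n_users - k11 - k12 - k21
            # LLR is large for negative association too, so keep only
            # pairs that co-occur more often than independence predicts.
            if k11 * k22 > k12 * k21:
                scores[cand] += llr(k11, k12, k21, k22)

    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_n]]


# Hypothetical tag clouds: user -> {word: frequency}.
tag_clouds = {
    "u1": {"tent": 3, "hiking": 5, "boots": 1},
    "u2": {"tent": 1, "hiking": 2, "stove": 4},
    "u3": {"laptop": 2, "keyboard": 1},
    "u4": {"laptop": 1, "keyboard": 3, "monitor": 2},
    "u5": {"hiking": 1, "tent": 2},
}

# Binarize: keep which words each user used, ignore how often.
binary = {u: set(counts) for u, counts in tag_clouds.items()}

# The recommended words become the text query for the product index.
query = " ".join(recommend_words(binary, "u5"))
print(query)
```

At 20 million product pages the heavy lifting would stay in the search engine; the recommender only has to produce a short list of words per user. Summing the LLR scores over the user's words is one simple way to rank candidates, not the only one.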
