At first glance, it doesn't seem like a recommender problem. You know which words the user uses frequently, and you know which terms describe products. It's just a search problem as Ted says -- minus even the recommendation phase.
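To make the "search problem" framing concrete, here is a minimal sketch in Python. It assumes the tag cloud is a word-to-count dict and each product page is already reduced to a list of terms. The names `score` and `top_pages` and the toy data are hypothetical, and the scoring (sum of the user's counts for the terms found on a page) is only one simple choice. A linear scan like this would not be practical for 20 million pages, where an inverted index such as Lucene's does the same job.

```python
import heapq

def score(tag_cloud, page_terms):
    """Sum the user's counts for each distinct term that appears on the page."""
    return sum(tag_cloud.get(term, 0) for term in set(page_terms))

def top_pages(tag_cloud, pages, k=100):
    """Return up to k page ids with the highest non-zero score, best first."""
    scored = ((score(tag_cloud, terms), page_id) for page_id, terms in pages.items())
    return [page_id for s, page_id in heapq.nlargest(k, scored) if s > 0]

# Toy data, purely illustrative.
tag_cloud = {"camera": 5, "lens": 3, "tripod": 1}
pages = {
    "p1": ["camera", "lens", "bag"],
    "p2": ["tripod"],
    "p3": ["sofa", "lamp"],
}
best = top_pages(tag_cloud, pages, k=2)
```

Pages that share no terms with the tag cloud are dropped rather than returned with a zero score.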
Is that all you want? Then try Lucene, probably. Or is it something different?

On Tue, Jul 26, 2011 at 9:49 PM, Ted Dunning <[email protected]> wrote:
> +Mahout user mailing list
>
> On Tue, Jul 26, 2011 at 12:38 PM, Srinivas Kasturi
> <[email protected]> wrote:
>
>> ... I came across your blog entry on surprise and coincidence, and wondered
>> if you can help me navigate what seems to be a confusing world of
>> recommendation algorithms. The problem statement is this:
>>
>> 1. I have information at a user level in the form of a tag cloud: Words
>> they have used and liked, along with a count of the frequency of incidence.
>>
>
> Excellent. This is a user x word matrix.
>
>> 2. I would like to use this information to run through a set of around 20
>> million product pages, and suggest to them the top 100 that they are most
>> likely to enjoy.
>>
>
> There are several ways to do this.
>
> One simple way is to use a binary recommender to recommend words to the user
> and then submit the resulting (long-ish) query to a search engine. You
> might pick a related subset of the recommended words as the query in order
> to get a shorter and more focused query.
>
>> This, in some way, is the surprise and coincidence problem, isn't it?
>>
>
> Yes. It is!
>
>> I am hoping to use one of the Mahout algorithms (
>> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms), but can't,
>> for the life of me, figure out which one is the closest fit.
>>
>
> Firstly there are programs that look at something like your user x term data
> to find terms that occur anomalously often. Secondly, there are
> recommendation systems that would let you recommend additional words to the
> user.
>
> I am sure that others will have other suggestions as well.
>
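For the "terms that occur anomalously often" part, one common choice is the log-likelihood ratio (G^2) test on a 2x2 table of counts, which is the statistic usually associated with the "surprise and coincidence" post. The sketch below is an assumption about what such a program could compute, not a copy of any particular Mahout job. The function names and the example counts are made up for illustration.

```python
from math import log

def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * log(x)

def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: times this user used the term
    k12: times this user used any other term
    k21: times all other users used the term
    k22: times all other users used any other term
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative values from rounding.
    return max(0.0, 2.0 * (row + col - mat))

# Hypothetical counts: the term is used about as often as expected...
unremarkable = llr(10, 10, 10, 10)
# ...versus far more often by this user than by everyone else.
surprising = llr(100, 10, 10, 1000)
```

A score near zero means the user's use of the term matches the background rate, while a large score flags a candidate for the query. Note that the statistic is symmetric, so it is also large for terms a user uses unusually rarely. Checking that k11 / (k11 + k12) exceeds k21 / (k21 + k22) keeps only the over-used ones.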
