Also, Lucene automagically does weighting that is very close to exactly what you want.
To Sean's question, the trick is that Lucene can store a list of item-item links that were filtered by cooccurrence statistics to form a binary matrix of interesting links. If you then use a user's recent item history as the query, you get back a list of items scored by treating each history item as weighted according to its rarity. The results are quite good and very fast.

The reason is that Lucene *is* weighted matrix multiplication of just the right sort.

This is what I was going to talk about in detail at ApacheCon.

On Mon, Jul 13, 2009 at 4:16 AM, Grant Ingersoll <[email protected]> wrote:

> I think Ted's suggestion is you'll find Lucene will be _a lot faster_ for
> this task as you don't need all the other trappings of a DB.
>
> On Jul 13, 2009, at 4:36 AM, Sean Owen wrote:
>
>> How does Lucene go from item-item links to recommendations? I'm
>> missing where the notion of user ratings, or even users, come into
>> play, or the strength of the association.
>>
>> If the issue is really just storing the item-item links efficiently in
>> a way that isn't in memory, how about I cook up a JDBC-based
>> implementation? Seem more direct.
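
To make the scoring idea at the top of this message concrete, here is a rough sketch in plain Python rather than Lucene. It assumes each item is a "document" whose terms are its filtered item-item links, and it uses a bare log(N/df) rarity weight; that is a simplification, not Lucene's actual scoring formula. The data and the names `indicators`, `idf_weights`, and `recommend` are made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy data: each item maps to the items it is linked to after
# cooccurrence filtering, i.e. one row of the binary item-item matrix.
indicators = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"C", "D"},
}

def idf_weights(indicators):
    """Weight each linked item by rarity: log(N / number of rows containing it)."""
    n_docs = len(indicators)
    df = defaultdict(int)
    for linked in indicators.values():
        for item in linked:
            df[item] += 1
    return {item: math.log(n_docs / count) for item, count in df.items()}

def recommend(history, indicators, top_n=3):
    """Score every item not in the history by summing the rarity weights of
    the history items found among its links (a weighted matrix-vector product)."""
    weights = idf_weights(indicators)
    scores = {}
    for candidate, linked in indicators.items():
        if candidate in history:
            continue
        score = sum(weights[item] for item in history if item in linked)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]

print(recommend({"C", "D"}, indicators))
```

The user's history plays the role of the query vector, and the rarity weights are why a match on a rarely linked item such as "D" counts for more than a match on the widely linked "C".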
