Oh, I think I misread this. You are right, the article similarity metric is based on content, not on ratings. This is good -- it is probably a better metric, and it injects new information into the system.
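To make "similarity based on content, not on ratings" concrete, here is a rough sketch. It assumes a plain bag-of-words cosine similarity, which is my own stand-in; I don't know what metric Findory actually uses. The names `term_counts` and `content_similarity` are made up for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase the text and count whitespace-separated terms."""
    return Counter(text.lower().split())


def content_similarity(text_a, text_b):
    """Cosine similarity between the term-count vectors of two articles.

    Assumption: bag-of-words cosine is only a stand-in for whatever
    content metric a real system would use.
    """
    a, b = term_counts(text_a), term_counts(text_b)
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty article is similar to nothing
    return dot / (norm_a * norm_b)
```

Note that no ratings appear anywhere in this computation, which is why it can be run once per article pair and stored.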
In that case, this sounds more like item-based recommender systems. Rather than find similar users and then see what they liked (user-based recommenders), you see which items are similar to the items the user likes. It's really just a user-based recommender turned on its side, but it is generally superior, because you can inject some external idea of item similarity rather than use correlation among ratings.

That external data is, well, more data to inform recommendations, and it is generally fixed over time and precomputable, whereas user-user similarity doesn't have this property. This is roughly the approach Amazon takes AFAIK.

It may be they're just using different words for the same thing, or in fact it is something different and I am just not familiar with it.

On Fri, Aug 29, 2008 at 12:02 AM, Satish Dandu <[EMAIL PROTECTED]> wrote:
> Hi Sean,
>
>>> 1. Findory's personalization used a type of hybrid collaborative filtering
>>> algorithm that recommended articles based on a combination of
>>> similarity of content and articles that tended to interest other
>>> Findory users with similar tastes.
>
> >Interesting -- yeah, that would be a hybrid of user-based and
> item-based approaches.
>
> When you say hybrid of user-based & item-based approaches (as both are forms
> of the collaborative approach), how can we get articles with similar content?
> From my understanding, I think Findory uses some kind of "Content Based
> Filtering" + "Collaborative Based Filtering". Content based filtering may be
> used to fetch documents with similar content. The best example would be making
> use of some sort of Lucene's "morelikeThis" or "Similar" queries. Correct me
> if I am wrong.
>
> Regards,
> -Satish Dandu
>
>
> -----Original Message-----
> From: Sean Owen [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 28 August 2008 2:49 PM
> To: [email protected]
> Subject: Re: Tasty Findory
>
> On Thu, Aug 28, 2008 at 9:20 PM, Otis Gospodnetic
> <[EMAIL PROTECTED]> wrote:
>> 1. Findory's personalization used a type of hybrid collaborative filtering
>> algorithm that recommended articles based on a combination of
>> similarity of content and articles that tended to interest other
>> Findory users with similar tastes.
>
> Interesting -- yeah, that would be a hybrid of user-based and
> item-based approaches.
>
> Usually, in a user-based approach, you find similar users, and then
> guess a rating for a new item by averaging the rating for that item of
> similar users -- weighted by the user similarity of course.
>
> Here, I imagine that in Findory you don't have a rating per se for
> articles, just a boolean yes/no. So you substitute a similarity metric
> between those items the user has read and a given new item.
>
> Yeah... that does add up to an interesting new approach, likely. I'd
> have to digest that a bit more to think about how to implement it
> right.
>
>
>> The way Findory does this is
>> that it pre-computes as much of the expensive personalization as it
>> can. Much of the task of matching interests to content is moved to an
>> offline batch process. The online task of personalization, the part
>> while the user is waiting, is reduced to a few thousand data lookups.
>
> Ah-ha, yeah, computing offline is not surprising. Good news, because
> that is the only option for the sorts of parallelization we are
> considering via Hadoop.
>
> There is a notion of "Rescorer" in the code which allows for injecting
> arbitrary logic to re-rank recommendations. That maps to the "online
> personalization" part, and indeed I think that is useful to allow for
> some cheap, real-time logic to affect rankings, on top of
> recommendations computed offline.
>
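For what it's worth, the boolean item-based scoring discussed in this thread could be sketched roughly as below. This is only an illustration under my own assumptions: a new item is scored by its average precomputed similarity to the items the user has read, the similarity table is a plain dict built offline, and `rescorer` is a hypothetical callable standing in for cheap online re-ranking logic. It is not the signature of the actual Rescorer in the code, and it is not Findory's algorithm.

```python
def lookup(item_similarity, a, b):
    """Return the precomputed similarity for an unordered pair, or 0.0."""
    return item_similarity.get((a, b), item_similarity.get((b, a), 0.0))


def recommend(read_items, item_similarity, top_n=3, rescorer=None):
    """Rank unread items by average similarity to the items already read.

    read_items: set of item ids the user has read (boolean yes/no data).
    item_similarity: dict of (item_a, item_b) -> similarity, assumed
        symmetric and computed offline (hypothetical layout).
    rescorer: optional callable (item, score) -> new score, or None to
        exclude the item (hypothetical stand-in for online logic).
    """
    if not read_items:
        return []
    # Every item mentioned in the table that the user has not read yet.
    candidates = {item for pair in item_similarity for item in pair}
    candidates -= set(read_items)

    scored = []
    for item in candidates:
        total = sum(lookup(item_similarity, item, r) for r in read_items)
        score = total / len(read_items)
        if rescorer is not None:
            score = rescorer(item, score)
            if score is None:
                continue  # the rescorer filtered this item out
        scored.append((item, score))

    # Highest score first; ties broken by item id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

The expensive part (filling `item_similarity`) is the piece that would run as a batch job; the function itself only does dictionary lookups, which is roughly the offline/online split described above.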
