Yeah, it would be nice, but it's not as simple as just averaging the output of two similarity functions, since those values may not be comparable at all. If the two similarity values each have a well-understood meaning, you may be able to combine them meaningfully, but I think this is hard to do correctly. People on this list may have better thoughts about situations in which it is valid to combine similarity functions into a new one.
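To make the caveat concrete, here is a minimal sketch of what a weighted combination would look like in plain Java. It does not use any Mahout API; the class name, method name and the weight parameter are hypothetical, and it assumes both scores have already been put on the same [0, 1] scale with the same meaning, which is exactly the part that is hard.

```java
public class WeightedSimilarity {

    /**
     * Weighted mean of two similarity scores.
     * Assumption: both inputs are already calibrated to the same [0, 1] scale.
     * contentWeight is the share given to the content-based score.
     */
    static double combine(double contentSim, double cfSim, double contentWeight) {
        if (contentWeight < 0.0 || contentWeight > 1.0) {
            throw new IllegalArgumentException("contentWeight must be in [0, 1]");
        }
        return contentWeight * contentSim + (1.0 - contentWeight) * cfSim;
    }
}
```

If the two inputs are not comparable, the result is still a number, but it no longer means anything in particular.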
Yes, one solution is to zoom way out and not try to combine two approaches at a low level but at a high level. For example what if you recommend from two recommenders, and only keep the items that appear in both outputs? That's pretty defensible.

On Wed, Oct 26, 2011 at 4:31 PM, Sören Brunk <[email protected]> wrote:
> Hi,
>
> I'm trying to combine content based recommendation using metadata like tags
> with the collaborative filtering mahout offers. As suggested in the Mahout
> book, I'm creating a custom item similarity based on some attributes to
> inject content based recommendation into the framework.
>
> But I'm not sure what's the best way to combine both approaches within one
> recommender. My idea was to create several similarities based on content,
> e.g. TagSimilarity, CategorySimilarity etc. (or one similarity by
> vectorizing all attribute information into one big vector).
> For CF I use something LogLikelihoodSimilarity as the data contains only
> boolean preferences.
> Then I would create some kind of combined similarity that takes the output
> of the individual similarities and combines them, possibly weighted.
>
> As an alternative I thought of creating individual recommenders for the
> content based and the CF approach and then combine the results of both
> recommenders somehow.
>
> Any suggestions what would the best approach here? Or would you do it in a
> completely different way?
>
> Thanks,
> Sören
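The "keep only the items that appear in both outputs" idea can be sketched in plain Java as below. This is an illustration under assumptions, not Mahout code: the class and method names are hypothetical, and the two lists stand for the item IDs returned by a content-based and a collaborative recommender, each already ordered by that recommender's ranking.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RecommendationIntersection {

    /**
     * Returns the item IDs present in both lists,
     * in the order they appear in the first list.
     */
    static List<Long> inBoth(List<Long> contentBased, List<Long> collaborative) {
        // LinkedHashSet keeps the first list's ranking order and drops duplicates.
        Set<Long> kept = new LinkedHashSet<>(contentBased);
        kept.retainAll(new HashSet<>(collaborative));
        return new ArrayList<>(kept);
    }
}
```

Since the intersection can be much smaller than either input, you would probably ask each recommender for more items than you finally want to show. With Mahout's Recommender interface the IDs would come from getItemID() on each RecommendedItem.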
