On Sep 6, 2013, at 9:33am, Pat Ferrel wrote:

> Been building the scaffold for demonstrating the Solr + Mahout recommenders.
> Have mined rotten tomatoes for reviews and movies. Browsing, simple search,
> and item-item similarities are working in the UX.
>
> One of the unique things about the Solr recommender is online recs. Two
> scenarios come to mind:
> 1) ask the user to pick from among a list of videos, taking the picks as
> preferences and making recs. Make more and see if recs improve.
> 2) watch the users' detail views during a browsing session and make recs
> based on those in realtime. A sort of "are you looking for something like
> this?" recommender.
>
> For #1 I've seen several examples (BTW very few give instant recs). Not sure
> how they pick what to rate. It seems to me a mix of popular and the videos
> with the most varying ratings would be best. Since we have thumbs up and down
> it would be simple to find individual videos with a high degree of both love
> and hate. Intuitively this would seem to help find the birds of a feather
> among the reviewers and help put the user in with the right set with the
> fewest preferences required.
>
> #2 seems straightforward. No idea if it will be useful. If #2 doesn't seem
> useful it may be modified to become the typical, makes recs based on all
> reviews but also includes recent reviews not yet in the training data. That's
> OK since we'd want to do it anyway.
>
> One nice thing about the implementation is that the Mahout Item-Based
> recommender output is available also so for any user in the training data
> we'll be able to show Solr recs and Mahout only recs side by side.
>
> Any thoughts on these experiments? Especially how to pick examples for the
> user in #1 to rate.

I'd probably try to cluster in advance, then at run-time randomly pick N (e.g. 10) clusters, and for each cluster randomly pick a video that's close to the centroid.

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
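
For illustration only, here is a rough Python sketch of the cluster-then-sample idea above. It assumes the clustering has already been run offline and that each video comes with a cluster id and a distance to its cluster centroid. The function name pick_videos_to_rate, the near_k cutoff for what counts as "close to the centroid", and the input layout are all hypothetical; none of this is Mahout or Solr API.

```python
import random
from collections import defaultdict


def pick_videos_to_rate(assignments, n_clusters=10, near_k=5, rng=None):
    """Pick one video from each of n_clusters randomly chosen clusters.

    assignments: iterable of (video_id, cluster_id, distance_to_centroid)
                 tuples, assumed to be precomputed offline.
    near_k:      hypothetical cutoff; a video counts as "close to the
                 centroid" if it is among the near_k nearest in its cluster.
    """
    rng = rng or random.Random()

    # Group videos by cluster, keeping the distance so we can rank them.
    by_cluster = defaultdict(list)
    for video_id, cluster_id, dist in assignments:
        by_cluster[cluster_id].append((dist, video_id))

    # Randomly choose which clusters to draw from (no repeats).
    cluster_ids = sorted(by_cluster)
    chosen = rng.sample(cluster_ids, min(n_clusters, len(cluster_ids)))

    picks = []
    for cluster_id in chosen:
        # Keep only the near_k videos nearest the centroid, then pick one
        # of them at random so repeated visits see different videos.
        nearest = sorted(by_cluster[cluster_id])[:near_k]
        _, video_id = rng.choice(nearest)
        picks.append(video_id)
    return picks


# Made-up data: 20 clusters of 10 videos each, with random distances.
demo_rng = random.Random(42)
demo = [(f"video-{c}-{i}", c, demo_rng.random())
        for c in range(20) for i in range(10)]
picks = pick_videos_to_rate(demo, n_clusters=10, near_k=3, rng=demo_rng)
```

Drawing from distinct clusters keeps the list shown to the user varied, and sampling among the few nearest videos, rather than always taking the single nearest, avoids presenting the same list on every visit.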
