After having integrated several versions of the Mahout and Myrrix recommenders 
at fairly large scale, I was interested in solving three problems that these 
did not directly provide for:
1) realtime queries for recs using data not yet incorporated into the training 
set. Myrrix allows this, but the Hadoop MapReduce version of Mahout does not.
2) cross-recommendations from two or more action types (say purchase and 
detail-view)
3) blending metadata and user preference data to return recs (for example 
category & user preferences => recs)

Using Solr + Mahout provided an amazingly flexible and performant way to do 
this. Ted wrote about his experience with this basic approach in his recent 
book. Take user preferences, run them through RowSimilarityJob, and you get an 
item-by-item similarity matrix. This is the core of an item-based cooccurrence 
recommender. If you take the similarity matrix, and convert it into a list of 
tokens per row, you have something Solr can index. If you then use a user’s 
history as a query on the indexed data you get an ordered list of 
recommendations.
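
As a rough illustration of the steps above (this is not the actual 
solr-recommender code), here is a minimal Python sketch. The field name 
similar_items, the top_n cutoff, and both function names are assumptions made 
up for this example, and item IDs are assumed to be plain alphanumeric tokens 
that need no query escaping.

```python
from typing import Dict, List


def rows_to_docs(similarity: Dict[str, Dict[str, float]],
                 top_n: int = 50) -> List[dict]:
    """Turn each row of an item-item similarity matrix into an indexable doc.

    `similarity` maps item ID -> {other item ID -> similarity score}.
    The "similar_items" field name is hypothetical.
    """
    docs = []
    for item_id, row in similarity.items():
        # Rank the other items by descending score (ties broken by ID),
        # skipping the item's similarity to itself.
        ranked = sorted(
            ((other, score) for other, score in row.items() if other != item_id),
            key=lambda pair: (-pair[1], pair[0]),
        )
        # The row becomes a space-separated list of item-ID tokens.
        tokens = [other for other, _ in ranked[:top_n]]
        docs.append({"id": item_id, "similar_items": " ".join(tokens)})
    return docs


def history_query(history: List[str], field: str = "similar_items") -> str:
    """Build a query string that ORs the user's history against the field."""
    return f"{field}:({' OR '.join(history)})"
```

Each doc would be posted to the index, and the string from history_query would 
be sent as the q parameter, so the search engine's relevance ranking orders 
the recs.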

When I set out to do #1 and #3, the first problem was the need for both CF data 
and metadata, so I mined the web for video reviews and video metadata. Logging 
the actions of users who visit the site will then provide data for #1 and #2.

The demo site is https://guide.finderbots.com and instructions are at the end 
of this for anyone who would like to test it out. As a crude user test, there 
is a procedure we ask you to follow to help gather data on the quality of 
recommendations. It’s running out of my closet over Comcast, so if it’s down I 
may have tripped over a cord. Sorry, try again later.

There are a bunch of different methods for making recs illustrated on the site. 
One method that illustrates blending uses your preference data together with 
metadata to bias and filter recs. Imagine that you have trained the system 
with your preferences by making some video picks. Now imagine you’d like to get 
recommendations for Comedies from Netflix based on your previous video 
preferences. This is done with a single Solr query on indexed video fields that 
hold genre, similar videos (from the similarity matrix), and sources. The query 
finds similar videos to the ones you have liked, with the genre “Comedy” 
boosted by some amount, but only those that have at least one source = 
“Netflix”. 
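
To make the shape of such a query concrete, here is a minimal Python sketch 
that builds the request parameters. It is one plausible reading of the 
description above, not the site's actual query: the field names 
(similar_videos, genre, sources), the default boost of 2.0, and the function 
name are all assumptions. It relies only on standard Solr/Lucene query syntax: 
+ for a required clause, ^ for a boost, and the fq filter-query parameter.

```python
from typing import Dict, List


def blended_query(liked: List[str], genre: str, source: str,
                  genre_boost: float = 2.0, rows: int = 24) -> Dict[str, str]:
    """Build Solr request parameters for a genre-biased, source-filtered query.

    All field names here are hypothetical; video IDs are assumed to be
    plain tokens that need no escaping.
    """
    # Required clause: the doc must list at least one liked video as similar.
    similar_clause = f"+similar_videos:({' OR '.join(liked)})"
    # Optional clause: matching the genre raises the score by the boost.
    genre_clause = f'genre:"{genre}"^{genre_boost}'
    return {
        "q": f"{similar_clause} {genre_clause}",
        # Filter query: restricts results without affecting their scores.
        "fq": f'sources:"{source}"',
        "rows": str(rows),
    }
```

Putting the source restriction in fq rather than q keeps it a pure filter, so 
only similarity and the genre boost influence the ordering.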

I’ll be doing some blog posts covering the specifics of how each rec type is 
done, the site and DB architecture, and Solr setup.

The project uses the Solr recommender prep code here: 
https://github.com/pferrel/solr-recommender

BTW I plan to publish obfuscated usage data in the GitHub repo.

begin form letter =======================================

Please use a recently updated browser (latest Firefox, Chrome, or Safari, and 
nothing older than IE10). The site doesn’t yet check browser compatibility but 
relies rather heavily on HTML5 and CSS3.

1) go to https://guide.finderbots.com/users/sign_up to create an account
2) go to https://guide.finderbots.com/trainers to ‘train’ the recommender: hit 
thumbs up on videos you like. There are 20 pages of training videos. You can 
leave at any time, but if you can go through them all it would be appreciated.
3) go to https://guide.finderbots.com/guides/recommend to immediately get 
personalized recs from your training data. If you completed the trainer, check 
the top line of recs and count how many are videos you liked or would like to 
see. Scroll right or left to see a total of 24 in four batches of 6. If you 
could report to me the total you thought were good recs, it would be greatly 
appreciated.
4) browse videos by various criteria here: https://guide.finderbots.com/guides 
These are not recommendations, they are simply a catalog.
5) control how you browse videos by clicking the gears icon. You can set all 
videos to be from one or more sources here. If you choose Netflix alone (don’t 
forget to uncheck ‘all’) then recs and browsed videos will all be available on 
Netflix.

