Yes. It still needs some work: the GitHub repo is hard to use without a better explanation of the Solr integration. It gets you most of the way there without a clear idea of how to do the rest.
I'm also thinking about porting it to Spark, since all it really needs is RSJ (RowSimilarityJob) and matrix multiply, not the entire recommender and cross-recommender.

On Apr 6, 2014, at 1:21 PM, Andrew Musselman wrote:

Pat, do you still want help putting this into a new mahout/examples, or work out how to do the distribution via "github pointer"? There's an open bug for that.

> On Apr 6, 2014, at 1:13 PM, Sebastian Schelter wrote:
>
> The top 3 recommendations "based on videos you liked" are very good!
>
> Nice job.
>
>> On 04/06/2014 07:26 PM, Pat Ferrel wrote:
>> After having integrated several versions of the Mahout and Myrrix
>> recommenders at fairly large scale, I was interested in solving three
>> problems that these did not directly provide for:
>> 1) realtime queries for recs using data not yet incorporated into the
>> training set. Myrrix allows this but Mahout using the Hadoop MR version does
>> not.
>> 2) cross-recommendations from two or more action types (say purchase and
>> detail-view)
>> 3) blending metadata and user preference data to return recs (for example
>> category & user preferences => recs)
>>
>> Using Solr + Mahout provided an amazingly flexible and performant way to do
>> this. Ted wrote about his experience with this basic approach in his recent
>> book. Take user preferences, run them through RowSimilarityJob and you get
>> an item-by-item similarity matrix. This is the core of an item-based
>> cooccurrence recommender. If you take the similarity matrix and convert it
>> into a list of tokens per row, you have something Solr can index. If you
>> then use a user’s history as a query on the indexed data you get an ordered
>> list of recommendations.
>>
>> When I set out to do #1 and #3 the need for CF data AND metadata was the
>> first problem. So I mined the web for video reviews and video metadata. Then
>> logging any users who visit the site will lead to data for #2 and #1.
>>
>> The demo site is https://guide.finderbots.com and instructions are at the
>> end of this for anyone who would like to test it out. As a crude user test
>> there is a procedure we ask you to follow to help gather quality of
>> recommendations data. It’s running out of my closet over Comcast so if it’s
>> down I may have tripped over a cord; sorry, try again later.
>>
>> There are a bunch of different methods for making recs illustrated on the
>> site. One method that illustrates blending metadata uses preference data
>> from you, and metadata to bias and filter recs. Imagine that you have
>> trained the system with your preferences by making some video picks. Now
>> imagine you’d like to get recommendations for Comedies from Netflix based on
>> your previous video preferences. This is done with a single Solr query on
>> indexed video fields that hold genre, similar videos (from the similarity
>> matrix), and sources. The query finds similar videos to the ones you have
>> liked, with the genre “Comedy” boosted by some amount, but only those that
>> have at least one source = “Netflix”.
>>
>> I’ll be doing some blog posts covering the specifics of how each rec type is
>> done, the site and DB architecture, and Solr setup.
>>
>> The project uses the Solr recommender prep code here:
>> https://github.com/pferrel/solr-recommender
>>
>> BTW I plan to publish obfuscated usage data in the github repo.
>>
>> begin form letter =======================================
>>
>> Please use a very newly updated browser (latest Firefox, Chrome, Safari, and
>> nothing older than IE10); the site doesn’t yet check browser compatibility
>> but relies on HTML5 and CSS3 rather heavily.
>>
>> 1) go to https://guide.finderbots.com/users/sign_up to create an account
>> 2) go to https://guide.finderbots.com/trainers to 'train' the recommender:
>> hit thumbs up on videos you like. There are 20 pages of training videos; you
>> can leave at any time, but if you can go through them all it would be
>> appreciated.
>> 3) go to https://guide.finderbots.com/guides/recommend to immediately get
>> personalized recs from your training data. If you completed the trainer,
>> check the top line of recs and count how many are videos you liked or would
>> like to see. Scroll right or left to see a total of 24 in four batches of 6.
>> If you could report to me the total you thought were good recs it would be
>> greatly appreciated.
>> 4) browse videos by various criteria here:
>> https://guide.finderbots.com/guides These are not recommendations; they are
>> simply a catalog.
>> 5) control how you browse videos by clicking the gears icon. You can set all
>> videos to be from one or more sources here. If you choose Netflix alone
>> (don’t forget to uncheck ‘all’) then recs and browsed videos will all be
>> available on Netflix.
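
For anyone trying to picture the Solr side of the quoted message, here is a rough Python sketch of the idea, not the code in the solr-recommender repo. It turns item-similarity rows into one indexable document per video, then builds the "similar to what I liked, boost Comedy, only Netflix" query. The field names (`similar`, `genre`, `sources`), the function names, the boost value, and the sample IDs are all made up for illustration. The query string assumes Solr's standard query parser with the default OR operator. `rank_locally` is only a stand-in that counts token overlap so the sketch runs without a Solr server; Solr's real relevance scoring is different.

```python
def rows_to_docs(similarity_rows, top_k=50):
    """Convert item-item similarity rows into Solr-style documents.

    similarity_rows maps an item ID to a list of (similar_item_id, score)
    pairs, roughly what you would have after RowSimilarityJob. Each document
    keeps the top_k most similar item IDs as space-separated tokens.
    """
    docs = []
    for item_id, neighbours in similarity_rows.items():
        best = sorted(neighbours, key=lambda pair: pair[1], reverse=True)[:top_k]
        docs.append({"id": item_id, "similar": " ".join(n for n, _ in best)})
    return docs


def build_query(history, genre=None, genre_boost=2.0, source=None):
    """Build Solr request parameters from a user's history of liked item IDs.

    The history is matched against the hypothetical `similar` field, an
    optional genre is boosted with ^, and an optional source becomes a
    filter query (fq) so it restricts results without affecting ranking.
    """
    clauses = ["similar:(%s)" % " ".join(history)]
    if genre:
        clauses.append('genre:"%s"^%s' % (genre, genre_boost))
    params = {"q": " ".join(clauses)}
    if source:
        params["fq"] = 'sources:"%s"' % source
    return params


def rank_locally(docs, history):
    """Stand-in for Solr: rank unseen items by overlap with the history."""
    liked = set(history)
    scored = []
    for doc in docs:
        if doc["id"] in liked:
            continue  # do not recommend what the user already picked
        overlap = len(liked & set(doc["similar"].split()))
        if overlap:
            scored.append((overlap, doc["id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored]


if __name__ == "__main__":
    # Hypothetical similarity rows for four videos.
    rows = {
        "v1": [("v2", 0.9), ("v3", 0.4)],
        "v2": [("v1", 0.9)],
        "v3": [("v1", 0.4), ("v2", 0.2)],
        "v4": [("v3", 0.5)],
    }
    docs = rows_to_docs(rows, top_k=2)
    print(build_query(["v1", "v2"], genre="Comedy", source="Netflix"))
    print(rank_locally(docs, ["v1", "v2"]))
```

Putting the source restriction in `fq` rather than in `q` matches the "only those that have at least one source" wording: it filters the result set, while the genre boost only nudges the ordering.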
