Three areas need work:

1) The script with sample data that is in the project should be converted into a JUnit test.
2) The current use of the Mahout RecommenderJob and various other bits of Mahout needs to be updated to the latest 0.9 candidate. (I'm working on this and expect to have it up to date before 0.9 is released.)
3) An example demo site with Solr needs to be built. I'm doing one, and some of Ted's group is doing another. Neither will be completely public, I think, so another example with sample data would be super helpful.
If you or someone else wants to help with #1 or #2, just fork the repo, let us know what you're doing, and create a pull request when you're ready. It's under the Apache license, like Mahout. If you want to do #3, I'll provide any help I can. Ping me if you'd like to discuss any of this. I'll update the JIRA with progress on #2.

------------------------------

I've said it before but would love to hear what others think: the rest of the implementation is simply integrating an app framework with Solr and finding some data. Therefore I'm proceeding with that.

What the github project does is prepare data, then run the RecommenderJob and the XRecommenderJob (a cross-recommender for multiple actions by users, which I built from Mahout DRM jobs) to create the item-item similarity matrix as well as the cross-action similarity matrix. The project then outputs CSV files in a Solr-digestible format, with the originally ingested item and user ids.

What I am doing for the demo site is:

1) Mining and updating a sample data set of critics' reviews from RottenTomatoes.com. The data set is user id (critic), item id (video), and preference (thumbs up or down), as well as a video catalog--working
2) Indexing with Solr the similarity matrix produced by the github project--working
3) Gathering user preferences; I'm doing this with a Web UI--working but not deployed
4) Using the user preferences as a more-like-this query against the output of the github project. This will produce realtime recommendations from the critic review training data--not implemented yet

The actual query and indexing are done by code in the app framework. This fits with the architecture in Ted's docs, but I've chosen a general-purpose app framework for the demo, not Liquid Search. #3 of the areas needing work could use Liquid Search or some other app framework to make the Solr results visible, but you would need data.
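For anyone picking up #3, step 4 above could be sketched roughly as follows. This is only an illustration, not code from the github project or the demo app: it assumes each Solr document is an item whose similar item ids were indexed into one field, and it uses an ordinary field query with the user's preferred item ids as terms rather than Solr's MoreLikeThis handler. The field name `similar_items`, the core name `items`, and the host are all made up for the example; only the standard Solr parameters `q`, `rows`, `fl`, and `wt` are real.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the host and the core name "items" are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/items/select"


def build_recommendation_params(preferred_item_ids, field="similar_items", rows=10):
    """Build Solr query parameters from the item ids a user has liked.

    Each indexed document is assumed to be an item whose `field` holds the
    ids of similar items, so matching on the user's ids ranks items that
    are similar to many of the things the user liked.
    """
    if not preferred_item_ids:
        raise ValueError("at least one preferred item id is required")
    # Quote every id so ids containing spaces or colons stay single terms.
    terms = " OR ".join(
        '"{}"'.format(str(item_id).replace('"', '\\"'))
        for item_id in preferred_item_ids
    )
    return {
        "q": "{}:({})".format(field, terms),
        "rows": str(rows),
        "fl": "id,score",
        "wt": "json",
    }


def build_recommendation_url(preferred_item_ids, base_url=SOLR_SELECT_URL, **kwargs):
    """Return the full select URL for the given preferences."""
    params = build_recommendation_params(preferred_item_ids, **kwargs)
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Example ids are invented; real ids would come from the ingested data.
    print(build_recommendation_url(["video_17", "video_42"]))
```

The app framework would issue this request and show the returned ids, minus anything the user has already rated. Whether a plain field query or the MoreLikeThis handler is the better fit depends on how the similarity output ends up being indexed.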
I have a sample app in early stages at https://guide.finderbots.com/users/login

uname: [email protected], pword: find3rbots

It currently caches poster images the first time they are fetched from RT, so it will often be slow. It's showing item-item similarities: when you look at a video detail, it shows thumbnails of 10 similar videos. Since it uses critics for preferences, the similar videos are somewhat surprising. Take it easy on the app; it's running in my bedroom closet.

On Oct 24, 2013, at 10:49 PM, Manuel Blechschmidt <[email protected]> wrote:

Hi Dominik,
the most important document is on Ted Dunning's Google Drive:
https://drive.google.com/folderview?id=0B7t2iY7e93hUNkJSbUtnd1kxUU0&usp=sharing
Design Document

Here is the corresponding JIRA entry:
https://issues.apache.org/jira/browse/MAHOUT-1288

And here is Pat's github repo:
https://github.com/pferrel/solr-recommender

On 25.10.2013 at 01:55, Dominik Hübner wrote:

> Having seen Ted presenting recommendation as search at the Munich Hadoop
> meetup, I remembered the new Solr recommender implemented by Pat. Are there
> any chances to contribute? I currently have some spare time, but could not
> find the related JIRA entry.
>

--
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobile: 0173/6322621
Twitter: http://twitter.com/Manuel_B
