Three areas need work:
1) The script with sample data that is in the project should be converted into 
a JUnit test.
2) The current use of the Mahout RecommenderJob and various other bits of 
Mahout need to be updated to the latest 0.9 candidate (I'm working on this and 
expect to have it up-to-date before 0.9 is released)
3) An example demo site with Solr needs to be built. I'm doing one, and some of 
Ted's group is doing another. Neither will be completely public, I think, so 
another example with sample data would be super helpful.

If you or someone else wants to help with #1 or #2 just fork the repo, let us 
know what you're doing, and create a pull request when you're ready. It's under 
the Apache license like Mahout. If you want to do #3 I'll provide any help I 
can. Ping me if you'd like to discuss any of this.

I'll update the JIRA with progress on #2.

------------------------------

I've said it before but would love to hear what others think: the rest of the 
implementation is simply integrating an app framework with Solr and finding 
some data. Therefore I'm proceeding with that.

What the GitHub project does is prepare data, then run the RecommenderJob and 
the XRecommenderJob (a cross-recommender for multiple actions by users that I 
built from Mahout DRM jobs) to create the item-item similarity matrix as well 
as the cross-action similarity matrix. The project then outputs CSV files in a 
Solr-digestible format, containing the originally ingested item and user ids.
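
As a rough illustration of the shape of that pipeline (this is not the project's code), the sketch below builds an item-item matrix from (user id, item id) pairs and writes one CSV row per item. It uses plain cooccurrence counts as a stand-in for what the Mahout jobs compute, and the column names `id` and `similar_items`, the sample data, and the space-delimited layout are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def cooccurrence(prefs):
    """Count how often two items were both preferred by the same user.

    prefs: iterable of (user_id, item_id) pairs.
    Returns {item_id: {other_item_id: count}}.
    """
    items_by_user = defaultdict(set)
    for user, item in prefs:
        items_by_user[user].add(item)

    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def to_csv(counts, top_n=10):
    """Write one row per item: its id and its top-N similar item ids.

    The similar ids go into a single space-delimited field so that a
    search engine can tokenize them (hypothetical layout).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "similar_items"])
    for item in sorted(counts):
        # Highest count first; ties broken by item id for stable output.
        ranked = sorted(counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
        writer.writerow([item, " ".join(other for other, _ in ranked[:top_n])])
    return out.getvalue()


# Made-up sample data: (critic, video) pairs.
sample_prefs = [
    ("critic1", "videoA"), ("critic1", "videoB"),
    ("critic2", "videoA"), ("critic2", "videoB"), ("critic2", "videoC"),
    ("critic3", "videoB"), ("critic3", "videoC"),
]
csv_text = to_csv(cooccurrence(sample_prefs))
print(csv_text)
```

The point is only that each item ends up as one document whose field lists the items similar to it, using the original external ids.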

What I am doing for the demo site is:
1) Mining and updating a sample data set from critics' reviews on 
RottenTomatoes.com. The data set is user id (critic), item id (video), 
preference (thumbs up or down) as well as a video catalog--working
2) Indexing with Solr the similarity matrix produced by the GitHub 
project--working
3) Gathering user preferences; I'm doing this with a web UI--working but not 
deployed
4) Using user preferences as a more-like-this query against the output of the 
GitHub project. This will produce realtime recommendations from the critic 
review training data--not implemented yet
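
Step 4 could be sketched roughly as follows. This is an assumption-laden illustration, not the demo's code: it builds a plain OR query over a field of similar item ids and sends it to Solr's standard `/select` handler, which is a simplification of a true more-like-this query. The core name `items`, the host, and the field names `id` and `similar_items` are hypothetical.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def recommendation_query_url(liked_item_ids, base_url,
                             field="similar_items", rows=10):
    """Build a Solr /select URL that uses a user's liked items as the query.

    Documents whose `field` mentions more of the liked items score higher,
    which is what turns the indexed similarity matrix into recommendations.
    """
    if not liked_item_ids:
        raise ValueError("need at least one liked item")

    # Quote each id as a phrase, escaping backslashes and double quotes.
    quoted = ['"%s"' % i.replace("\\", "\\\\").replace('"', '\\"')
              for i in liked_item_ids]
    ids = " OR ".join(quoted)

    params = {
        "q": "%s:(%s)" % (field, ids),
        # Filter out items the user has already rated.
        "fq": "-id:(%s)" % ids,
        "fl": "id,score",
        "rows": rows,
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical local Solr core named "items".
url = recommendation_query_url(["videoA", "videoC"],
                               "http://localhost:8983/solr/items")
print(url)
```

The user's history is the query and the ranked hits are the recommendations, so nothing has to be precomputed per user.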

The actual query and indexing are from code in the app framework. This fits 
with the architecture in Ted's docs but I've chosen a general purpose app 
framework for the demo, not Liquid Search. #3 of the areas needing work could 
use Liquid Search or some other app framework to make Solr results visible, but 
you would need data.

I have a sample app in early stages at https://guide.finderbots.com/users/login 
(uname: [email protected], pword: find3rbots). It currently caches poster 
images the first time they are fetched from RT, so it will often be slow. It's 
showing item-item similarities. When you look at a video detail it shows thumbs 
of 10 similar videos. Since it uses critics for preferences, the similar videos 
are somewhat surprising. 

Take it easy on the app, it's running in my bedroom closet.

On Oct 24, 2013, at 10:49 PM, Manuel Blechschmidt <[email protected]> 
wrote:

Hi Dominik,
the most important document is on Ted Dunning's Google Drive:

https://drive.google.com/folderview?id=0B7t2iY7e93hUNkJSbUtnd1kxUU0&usp=sharing

Design Document

Here is the corresponding JIRA entry:
https://issues.apache.org/jira/browse/MAHOUT-1288

And here is Pat's GitHub repo:
https://github.com/pferrel/solr-recommender


On 25.10.2013 at 01:55, Dominik Hübner wrote:

> Having seen Ted presenting recommendation as search at the Munich Hadoop 
> meetup, I remembered the new Solr recommender implemented by Pat. Are there 
> any chances to contribute? I currently have some spare time, but could not 
> find the related JIRA entry.
> 

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B

