Why not just use Mahout for this? It has an item similarity algorithm that
does exactly this :)

https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html

Mahout can be run in either distributed or non-distributed mode.
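
For comparison, the idea in the quoted message (Euclidean distance over
normalized item attributes) can be prototyped outside Solr in a few lines.
This is only a rough sketch, not Mahout's or Lucene's API: the function names
and the attribute dictionaries are made up for illustration, min-max scaling
is just one assumed way to normalize, and the brute-force scan would have to
be precomputed offline for an index of a million docs.

```python
import math


def normalize(items):
    """Min-max scale every attribute to [0, 1] across all items.

    `items` maps an item id to a dict of numeric attributes (hypothetical
    layout; missing attributes are treated as 0.0).
    """
    keys = sorted({k for attrs in items.values() for k in attrs})
    lo = {k: min(a.get(k, 0.0) for a in items.values()) for k in keys}
    hi = {k: max(a.get(k, 0.0) for a in items.values()) for k in keys}
    scaled = {}
    for item_id, attrs in items.items():
        scaled[item_id] = [
            # A constant attribute carries no information, so map it to 0.0.
            (attrs.get(k, 0.0) - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0
            for k in keys
        ]
    return scaled


def most_similar(items, current_id, n):
    """Return the ids of the n items closest to `current_id`."""
    vectors = normalize(items)
    target = vectors[current_id]
    scored = [
        (math.dist(target, vec), item_id)
        for item_id, vec in vectors.items()
        if item_id != current_id
    ]
    scored.sort()  # smallest distance first
    return [item_id for _, item_id in scored[:n]]


if __name__ == "__main__":
    catalog = {
        "a": {"price": 10, "rating": 4.0},
        "b": {"price": 12, "rating": 4.1},
        "c": {"price": 100, "rating": 1.0},
    }
    print(most_similar(catalog, "a", 2))
```

Per-attribute weights could be applied by multiplying each scaled value before
the distance is taken; doing that offline avoids the query-time recomputation
cost mentioned below.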

> From: lcguerreroc...@gmail.com
> Date: Fri, 28 Jun 2013 12:16:57 -0500
> Subject: Content based recommender using lucene/solr
> To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
> 
> Hi,
> 
> I'm using lucene and solr right now in a production environment with an
> index of about a million docs. I'm working on a recommender that basically
> would list the n most similar items to the user based on the current item
> he is viewing.
> 
> I've been thinking of using solr/lucene since I already have all docs
> available and I want a quick version that can be deployed while we work on
> a more robust recommender. How about overriding the default similarity so
> that it scores documents based on the euclidean distance of normalized item
> attributes and then using a morelikethis component to pass in the
> attributes of the item for which I want to generate recommendations? I know
> it has its issues like recomputing scores/normalization/weight application
> at query time which could make this idea unfeasible/impractical. I'm at a
> very preliminary stage right now with this and would love some suggestions
> from experienced users.
> 
> thank you,
> 
> Luis Guerrero
                                          
