On 07/22/2013 12:20 PM, Pat Ferrel wrote:
> My understanding of the Solr proposal is that it puts B's row-similarity
> matrix into a vector per item. That means each row is turned into
> "terms" = external IDs -- I'm not sure how the weights of each term are
> encoded.
This is the key question for me. The best idea I've had is to use
termFreq as a proxy for weight. It's only an integer, so there are
scaling issues to consider, but you can apply a per-field weight to
manage that.

Also, Lucene (and Solr) doesn't provide an obvious way to load term
frequencies directly: probably the simplest thing to do is just to
repeat each cross-term N times and let the text analysis take care of
counting them. That's inefficient, but probably the quickest way to get
going. Alternatively, there are some lower-level Lucene indexing APIs
(DocFieldConsumer et al.) which I haven't fully explored, but which
would allow for more direct loading of fields.
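To make the repetition trick concrete, here's a minimal sketch of what
I have in mind, assuming the Lucene 4.x document API (the class and
field names here are hypothetical): each external ID is repeated
round(weight * scale) times, so the analyzer's term-frequency counting
recovers an integer approximation of the weight.

import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

// Hypothetical encoder: approximate each weight as an integer
// repetition count so that term-frequency counting recovers it.
public class RepeatedTermEncoder {

  // Repeat each cross-term round(weight * scale) times, space-separated.
  static String encode(Map<String, Double> termWeights, double scale) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Double> e : termWeights.entrySet()) {
      long reps = Math.round(e.getValue() * scale);
      for (long i = 0; i < reps; i++) {
        sb.append(e.getKey()).append(' ');
      }
    }
    return sb.toString();
  }

  // Build a document whose "similar_items" field carries one row of the
  // similarity matrix as repeated external IDs.
  static Document toDocument(String itemId, Map<String, Double> row,
                             double scale) {
    Document doc = new Document();
    doc.add(new StringField("item_id", itemId, Field.Store.YES));
    doc.add(new TextField("similar_items", encode(row, scale),
                          Field.Store.NO));
    return doc;
  }
}

The scale parameter is where the integer-quantization tradeoff lives: a
bigger scale preserves more weight resolution at the cost of a fatter
index.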
Then one probably wants to override the scoring in some way (unless
TF-IDF is somehow the way to go??)
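For instance, a minimal sketch of that override, assuming the Lucene
4.x similarity API: score linearly by the encoded term frequency and
switch off IDF and length normalization, both of which would otherwise
distort the encoded weights.

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

// Hypothetical similarity that treats the repeated-term frequency as
// the weight itself.
public class RawWeightSimilarity extends DefaultSimilarity {

  @Override
  public float tf(float freq) {
    return freq; // linear in tf instead of the default sqrt(freq)
  }

  @Override
  public float idf(long docFreq, long numDocs) {
    return 1.0f; // ignore document frequency entirely
  }

  @Override
  public float lengthNorm(FieldInvertState state) {
    return 1.0f; // don't penalize heavily repeated fields
  }
}

In Solr this kind of class can be plugged in via a <similarity
class="..."/> element in schema.xml; on the plain Lucene side you'd set
it on both the IndexWriterConfig and the IndexSearcher.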