On 7/23/13 7:26 PM, Pat Ferrel wrote:
Honestly not trying to make this more complicated but…



 From past experience I strongly suspect item similarity rank is not something 
we want to lose so unless someone has a better idea I'll just order the IDs in 
the fields and call it good for now.


If I understand you correctly, you are concerned about just throwing all the items in without regard to order or weight. I think Ted's suggestion was not to worry about that, but if you do have time and want to tackle this, one thing you can do is add an item multiple times. For example, suppose you have items A, B, C, ... with A ranked highest. Then index a "document" in Solr like this:

A A A B B C

This will end up giving A a higher term frequency in the index than B or C.

The number of repeats would be kind of arbitrary. You might want to make it a linear function of rank or a quantized version of the similarity score.
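
To make that concrete, here is a rough sketch of how the field text could be built. It only produces the string you would put in the field; nothing here talks to Solr. The function names (repeats_from_rank, repeats_from_score, build_field), the choice of 3 quantization levels, and the assumption that similarity scores fall in [0, 1] are all mine, just for illustration.

```python
def repeats_from_rank(rank, n_items):
    """Linear in rank: rank 0 (best) gets n_items repeats, the last item gets 1."""
    return n_items - rank


def repeats_from_score(score, levels=3):
    """Quantize a similarity score (assumed to lie in [0, 1]) into 1..levels repeats."""
    clamped = min(max(score, 0.0), 1.0)
    return max(1, min(levels, int(clamped * levels) + 1))


def build_field(ranked_ids, repeat_counts):
    """Repeat each item ID the given number of times and join with spaces."""
    tokens = []
    for item_id, count in zip(ranked_ids, repeat_counts):
        tokens.extend([item_id] * count)
    return " ".join(tokens)


# Items already sorted by similarity, best first.
ranked = ["A", "B", "C"]
counts = [repeats_from_rank(r, len(ranked)) for r in range(len(ranked))]
by_rank = build_field(ranked, counts)
print(by_rank)  # A A A B B C
```

With long similarity lists the linear version gets big quickly (the top item is repeated once per item in the list), so the quantized version, or some cap on the repeat count, is probably the more practical of the two.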

But this might end up being a noise-level effect ... it's probably not worth losing sleep over. On the other hand, it's probably less useful to order the IDs since once they get put in the index the token "order" is stored as a "position" which isn't (usually) used for scoring, although I suppose some custom scorer could do that, too.

-Mike
