For multi-word queries, I would like to reward documents in which the query
terms occur with roughly equal frequency and penalize documents in which one
term dominates.  For example, if my search query is:

+content:fast +content:car

I would prefer a document that contains each word an equal number of times
over a document that contains the word "fast" 100 times and the word "car"
only once.  In other words, I would like to compare the scores of the
individual BooleanQuery clauses and adjust the document's overall score
according to how evenly they are distributed.
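
To make the idea concrete, here is a rough sketch of the kind of adjustment I
have in mind.  It is plain Java and does not use any Lucene API: the class
name EvennessBoost and the methods evenness() and adjust() are made up for
illustration, and I am assuming I can somehow get at the per-term frequencies
for a document.  Normalized entropy is just one possible measure of evenness,
not something Lucene provides.

```java
/** Hypothetical helper, not part of Lucene: scales a score by how evenly the query terms occur. */
public final class EvennessBoost {
    private EvennessBoost() {}

    /**
     * Normalized Shannon entropy of the term frequencies, in [0, 1].
     * Returns 1.0 when every term occurs equally often and moves toward 0
     * as a single term dominates.
     */
    public static double evenness(int[] termFreqs) {
        if (termFreqs.length < 2) {
            return 1.0; // a single-term query cannot be skewed
        }
        long total = 0;
        for (int tf : termFreqs) {
            if (tf < 0) {
                throw new IllegalArgumentException("negative term frequency: " + tf);
            }
            total += tf;
        }
        if (total == 0) {
            return 0.0; // no query term occurs in the document
        }
        double entropy = 0.0;
        for (int tf : termFreqs) {
            if (tf == 0) {
                continue; // 0 * log(0) is taken as 0
            }
            double p = (double) tf / total;
            entropy -= p * Math.log(p);
        }
        // Divide by the maximum possible entropy for this many terms.
        return entropy / Math.log(termFreqs.length);
    }

    /** Multiplies the raw score by the evenness factor (an assumed way to combine them). */
    public static double adjust(double rawScore, int[] termFreqs) {
        return rawScore * evenness(termFreqs);
    }
}
```

With this measure, a document with "fast" 5 times and "car" 5 times keeps its
full score, while one with "fast" 100 times and "car" once is scaled down to
about 8% of its raw score.  What I do not know is where in Lucene such a
factor could be applied.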

Can somebody point me in the right direction as to how I would implement
this?

Thanks,
Andy
