For multi-word queries, I would like to reward documents that contain a more even distribution of the query words and penalize documents that have a skewed distribution. For example, if my search query is:

+content:fast +content:car

I would prefer a document that contains each word an equal number of times over a document that contains the word "fast" 100 times and the word "car" 1 time. In other words, I would like to compare the scores of each BooleanQuery term and adjust the overall score according to their distribution.

Can somebody point me in the right direction as to how I would implement this?

Thanks,
Andy
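
A minimal sketch of one way the adjustment described above could be computed, shown in plain Java without any Lucene calls. It assumes the per-term values (term frequencies or per-clause scores) have already been obtained somehow; how to hook this into Lucene's scoring is not shown. The class and method names (EvennessBoost, evenness, adjust) are hypothetical, and normalized entropy is only one possible measure of "evenness", not something Lucene provides.

```java
public final class EvennessBoost {
    private EvennessBoost() {}

    /**
     * Hypothetical helper: returns a value in [0, 1] describing how evenly
     * the per-term values are distributed, using normalized Shannon entropy.
     * Equal values give 1.0; the more one term dominates, the closer to 0.
     */
    public static double evenness(double[] termValues) {
        if (termValues.length < 2) {
            return 1.0; // a single term cannot be skewed
        }
        double total = 0.0;
        for (double v : termValues) {
            if (v < 0.0) {
                throw new IllegalArgumentException("values must be non-negative");
            }
            total += v;
        }
        if (total == 0.0) {
            return 0.0; // no term matched at all
        }
        double entropy = 0.0;
        for (double v : termValues) {
            if (v > 0.0) {
                double p = v / total;
                entropy -= p * Math.log(p);
            }
        }
        // Divide by the maximum possible entropy, log(n), to normalize.
        return entropy / Math.log(termValues.length);
    }

    /** Hypothetical helper: scales a base score by the evenness factor. */
    public static double adjust(double baseScore, double[] termValues) {
        return baseScore * evenness(termValues);
    }

    public static void main(String[] args) {
        // "fast" and "car" appear equally often: the score is left unchanged.
        System.out.println(adjust(1.0, new double[] {5, 5}));
        // "fast" 100 times, "car" once: the score is reduced sharply.
        System.out.println(adjust(1.0, new double[] {100, 1}));
    }
}
```

Multiplying by the factor only ever penalizes skew. A gentler variant, again just an assumption, would blend it with the original score so that a skewed document is demoted rather than nearly zeroed out.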