We have a custom "tagger" application which identifies certain entities (companies, etc.) and assigns a "relevance" value to each entity based on its overall relevance within a given document.
Then we index these "tags" into a Lucene index by storing them in an indexed field (same name, different values), for example "company=A, company=B, company=C", etc. I know how to set the boost on each field according to the relevance value from our tagging application.

However, sorting does not seem to work properly, since according to the documentation all boost values per document under fields of the same name are combined by multiplying them together.

From http://lucene.apache.org/java/docs/scoring.html:

"For each field of a document, all boosts of that field (i.e. all boosts under the same field name in that doc) are multiplied."

So if I have two documents, each with some entities:

Doc1: A (100%), B (50%), C (25%)
Doc2: A (75%), D (50%)

then a query for A should return Doc1 ahead of Doc2, because A is more relevant in Doc1 (100% vs. 75%). But it seems that what happens is this:

Doc1 boost = 1.0 * 0.5 * 0.25 = 0.125
Doc2 boost = 0.75 * 0.50 = 0.375

Therefore a query for A returns Doc2 ahead of Doc1.

Is there a way around this (besides creating a different field name for each tag)? Can I create custom similarity or scoring classes to handle this at query time somehow?

Thanks,
Bob
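
P.S. In case it helps, here is the arithmetic above as a minimal plain-Java sketch. It does not call any Lucene API; the class name BoostProduct and the method combinedBoost are made up for illustration, and it assumes the per-field boosts combine exactly as the scoring page quoted above describes.

```java
// Plain-Java illustration only; nothing here is a Lucene class or method.
class BoostProduct {

    // Multiplies all boosts given to same-named fields within one document,
    // mirroring the rule quoted from the scoring documentation.
    static double combinedBoost(double... boosts) {
        double product = 1.0;
        for (double b : boosts) {
            product *= b;
        }
        return product;
    }

    public static void main(String[] args) {
        // Relevance values from the example: Doc1 has A, B, C; Doc2 has A, D.
        double doc1 = combinedBoost(1.0, 0.5, 0.25);
        double doc2 = combinedBoost(0.75, 0.5);

        System.out.println("Doc1 combined boost = " + doc1);
        System.out.println("Doc2 combined boost = " + doc2);
        // The boost for A alone (1.0 vs. 0.75) is lost in the product.
        System.out.println("Doc2 ahead of Doc1 for A: " + (doc2 > doc1));
    }
}
```

Running it gives 0.125 for Doc1 and 0.375 for Doc2, which is consistent with Doc2 coming back first even though A is tagged as more relevant in Doc1.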