This might end up being more of a Lucene question, but anyway... For a multivalued field, it appears that term frequency is calculated as something a little like:

    sum(tf(value1), ..., tf(valueN))

I'd rather my score not give preference based on how *many* of the values in the multivalued field matched; I want it to give preference based on the value that matched *best*. In other words, something more like:

    max(tf(value1), ..., tf(valueN))

Put another way, I want a search like

    q=mvf:foo

against a document with a multivalued field:

    mvf: [ "foo" ]

to get scored exactly the same as a document with a multivalued field:

    mvf: [ "foo", "foo" ]

but worse than a document with a multivalued field:

    mvf: [ "foo foo" ]

I'm guessing this'd require a custom Similarity implementation, but I'm beginning to wonder if even that is low enough level.

Other thoughts? This seems like a pretty obvious desire.

Thanks.
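
P.S. In case the ordering I'm after isn't clear, here's a rough sketch of the two aggregation rules in plain Python. This is not Lucene or Solr code: the function names (tf, sum_tf, max_tf) are made up for illustration, whitespace splitting stands in for a real analyzer, and I'm assuming a square-root tf like the one in Lucene's classic TF-IDF similarity. It also ignores length norms and idf entirely.

```python
import math


def tf(value, term):
    """Per-value term frequency: sqrt of the raw count (assumed tf shape)."""
    count = value.lower().split().count(term)  # naive whitespace "analyzer"
    return math.sqrt(count)


def sum_tf(values, term):
    """Roughly what I seem to get today: every matching value adds to the score."""
    return sum(tf(v, term) for v in values)


def max_tf(values, term):
    """What I want: only the best-matching value counts."""
    return max((tf(v, term) for v in values), default=0.0)


# The three hypothetical documents from the question.
docs = {
    "one value": ["foo"],
    "two values": ["foo", "foo"],
    "repeated in one value": ["foo foo"],
}

for name, values in docs.items():
    print(name, "sum:", sum_tf(values, "foo"), "max:", max_tf(values, "foo"))
```

With the max rule, [ "foo" ] and [ "foo", "foo" ] come out equal and [ "foo foo" ] comes out higher, which is the ordering I described. With the sum rule, [ "foo", "foo" ] is rewarded just for having more matching values.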