On 8/30/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: > 1) hl.scoring=simple (the default) - construct with Query only. May have some matches from other terms, but allows you to highlight different fields to the ones searched.
: > 2) hl.scoring=field - constructed with Query and fieldName. Only highlights terms matched in this field by the query.
: > 3) hl.scoring=fieldidx - constructed with Query, fieldName and IndexReader. I think the selection of the best fragment(s) will be improved because the terms will be weighted according to their frequency in the index - but this has to be more costly as it calls IndexReader.docFreq for each term.
:
: So, is there a better way to describe the differences here in a way
: 1) an option to specify that all words in the query should be
: highlighted on all selected fields.

this sounds like hl.style=any

: 2) an option to specify that words should be highlighted only if the
: query matched the specific field

sounds like hl.style=strict

: Question: would the phrase query "spider man" cause highlighting of
: "the spider bit the man"?
: 3) when finding best matches, score rarer terms higher than common terms

sounds like hl.style=best

I'll add my 2c. First, let's drop option 2: we don't allow any other customization of the way fragment scoring works, so if field-specific highlighting is enabled, we might as well always enable idf scoring (this can be changed later).

Second, I don't see a name that is simple and clear. At the risk of verbosity, how about:

requireFieldMatch=true/[false]

or

fieldMustMatchQuery=[true]/false

best,
-Mike
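
To make the proposed semantics concrete, here is a rough, self-contained Python sketch of what a requireFieldMatch-style flag could mean, with idf-style weighting used to pick the best fragment. It is not the Lucene or Solr highlighter API: every name in it (terms_to_highlight, idf_weight, best_fragment, the tiny in-memory corpus) is hypothetical, and the weight formula is just one common idf variant chosen for illustration.

```python
import math
import re


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real analyzer."""
    return re.findall(r"\w+", text.lower())


def terms_to_highlight(query, field, require_field_match):
    """query is a list of (field, term) pairs (hypothetical representation).

    With require_field_match=False, every query term is highlighted in every
    field. With True, only terms the query aimed at this field are used.
    """
    if require_field_match:
        return {term for f, term in query if f == field}
    return {term for _, term in query}


def highlight(text, terms):
    """Wrap each matching token in <em> tags, leaving other text untouched."""
    def mark(match):
        word = match.group(0)
        return f"<em>{word}</em>" if word.lower() in terms else word
    return re.sub(r"\w+", mark, text)


def idf_weight(term, field, corpus):
    """Rarer terms get larger weights. One document-frequency count per term,
    which is the extra cost mentioned for the index-aware option."""
    doc_freq = sum(1 for doc in corpus if term in tokenize(doc.get(field, "")))
    return math.log(len(corpus) / (doc_freq + 1)) + 1.0


def best_fragment(fragments, terms, field, corpus):
    """Pick the fragment whose distinct matched terms have the highest total weight."""
    def score(fragment):
        matched = set(tokenize(fragment)) & terms
        return sum(idf_weight(t, field, corpus) for t in matched)
    return max(fragments, key=score)


if __name__ == "__main__":
    query = [("title", "spider"), ("body", "man")]
    print(highlight("The spider bit the man",
                    terms_to_highlight(query, "title", require_field_match=True)))
    print(highlight("The spider bit the man",
                    terms_to_highlight(query, "title", require_field_match=False)))
```

Note that this sketch is purely term-based, so a phrase query for "spider man" would be reduced to the two terms and both would be marked in "the spider bit the man". Whether the real highlighter should behave that way is exactly the open question quoted above.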