>>For what it's worth Mark (Miller), there *is* a need for "just highlight the query terms without trying to get excerpts" functionality
>>- something a la Google cache (different colours...mmm, nice).

FWIW, the existing highlighter doesn't *have* to fragment - just pass a NullFragmenter to the highlighter. Ideally we'd have one implementation that tackles phrase support and preserves (optional) support for selecting fragments. I can see that to achieve this the existing highlighter design would need to change. Currently the highlighter identifies fragments first (typically using an implementation which arbitrarily chops text after 'n' words) and then selects which of these fragments have the highest density of high-scoring query terms. This logic would need to change to :
1) Use QuerySpansExtractor to identify all the *spans* in the document
2) Use a sliding window to select fragments, taking care to select fragments that wholly contain spans, rather than selecting only part of a span.
3) Mark up the hits.
Clearly, for people uninterested in selecting fragments, step 2 can be skipped.

Cheers
Mark


        
        
                
___________________________________________________________ All new Yahoo! Mail "The new Interface is stunning in its simplicity and ease of use." - PC Magazine http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to