Highlight fragment does not extend to maxDocCharsToAnalyze
----------------------------------------------------------

                 Key: LUCENE-1321
                 URL: https://issues.apache.org/jira/browse/LUCENE-1321
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/highlighter
    Affects Versions: 2.4
            Reporter: Lars Kotthoff
            Priority: Minor


The current highlighter code checks whether the total length of the text to 
highlight is strictly smaller than maxDocCharsToAnalyze before appending any 
text remaining after the last token to the fragment. This means that if 
maxDocCharsToAnalyze is set to exactly the length of the text, and the last 
token of the text is the term to highlight and is followed by non-token text, 
this non-token text is left out of the highlighted fragment.

For example, consider the phrase "this is a text with searchterm in it". "in" 
and "it" are not tokenized because they're stopwords. Setting 
maxDocCharsToAnalyze to 36 (the length of the sentence) and searching for 
"searchterm" gives a fragment ending in "searchterm". The expected behaviour is 
to have "in it" at the end of the fragment, since maxDocCharsToAnalyze 
explicitly states that the whole phrase should be considered.
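
The boundary condition described above can be sketched in isolation. This is 
a minimal illustration, not the highlighter's actual code and not the Lucene 
API: the class TrailingTextSketch, the method buildFragment and the 
"inclusive" flag are hypothetical names, and the end offset of the last token 
is computed by hand instead of coming from an analyzer.

```java
class TrailingTextSketch {

    // Hypothetical helper (not part of Lucene): builds a fragment from the
    // text up to the end of the last token, then appends the trailing
    // non-token text only if the length check passes.
    static String buildFragment(String text, int lastEndOffset,
                                int maxDocCharsToAnalyze, boolean inclusive) {
        StringBuilder fragment =
                new StringBuilder(text.substring(0, lastEndOffset));

        // Strict comparison models the behaviour reported in this issue;
        // inclusive comparison models the expected behaviour.
        boolean withinLimit = inclusive
                ? text.length() <= maxDocCharsToAnalyze
                : text.length() < maxDocCharsToAnalyze;

        if (lastEndOffset < text.length() && withinLimit) {
            fragment.append(text.substring(lastEndOffset));
        }
        return fragment.toString();
    }

    public static void main(String[] args) {
        String text = "this is a text with searchterm in it";
        // "in" and "it" are assumed to be dropped as stopwords, so the last
        // token ends right after "searchterm".
        int lastEndOffset =
                text.indexOf("searchterm") + "searchterm".length();

        // Strict check with the limit equal to the text length:
        // the trailing " in it" is dropped.
        System.out.println(
                buildFragment(text, lastEndOffset, text.length(), false));

        // Inclusive check with the same limit: the trailing text is kept.
        System.out.println(
                buildFragment(text, lastEndOffset, text.length(), true));
    }
}
```

With the strict comparison the first line printed ends in "searchterm"; with 
the inclusive comparison the second line ends in "in it".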
