[ https://issues.apache.org/jira/browse/LUCENE-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609220#action_12609220 ]
Mark Miller commented on LUCENE-1321:
-------------------------------------

Thanks Lars. Nice catch - not an easy spot <g>

Looks good to me. When I get a few free minutes I'll go over it a bit more, but on first inspection it certainly looks like the right fix, and all tests pass.

> Highlight fragment does not extend to maxDocCharsToAnalyze
> ----------------------------------------------------------
>
> Key: LUCENE-1321
> URL: https://issues.apache.org/jira/browse/LUCENE-1321
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/highlighter
> Affects Versions: 2.4
> Reporter: Lars Kotthoff
> Assignee: Mark Miller
> Priority: Minor
> Attachments: LUCENE-1321.patch
>
>
> The current highlighter code checks whether the total length of the text to
> highlight is strictly smaller than maxDocCharsToAnalyze before adding any
> text remaining after the last token to the fragment. This means that if
> maxDocCharsToAnalyze is set to exactly the length of the text, and the last
> token of the text is the term to highlight and is followed by non-token text,
> this non-token text will not be included in the fragment.
> For example, consider the phrase "this is a text with searchterm in it". "In"
> and "it" are not tokenized because they're stopwords. Setting
> maxDocCharsToAnalyze to 36 (the length of the sentence) and searching for
> "searchterm" gives a fragment ending in "searchterm". The expected behaviour
> is to have "in it" at the end of the fragment, since maxDocCharsToAnalyze
> explicitly states that the whole phrase should be considered.
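The boundary condition described in the issue can be illustrated with a small standalone sketch. This is not the highlighter's code and not the attached patch: the class `FragmentTailSketch`, the method `fragment`, and the `inclusive` flag are hypothetical names made up for illustration, and the sketch assumes the check is a plain comparison of the text length against maxDocCharsToAnalyze, as the description suggests. It only shows how a strict comparison drops the trailing "in it" when the limit equals the text length, and how an inclusive comparison would keep it.

```java
// Hypothetical stand-in for the tail handling described in the issue;
// it does not use or reproduce Lucene's actual highlighter code.
final class FragmentTailSketch {

    /**
     * Returns the text up to the end of the last token, followed by any
     * trailing non-token text if the length check allows it.
     *
     * @param lastEndOffset end offset of the last token the analyzer produced
     * @param inclusive     true to compare with <=, false to compare with <
     */
    static String fragment(String text, int lastEndOffset,
                           int maxDocCharsToAnalyze, boolean inclusive) {
        StringBuilder out = new StringBuilder(text.substring(0, lastEndOffset));
        boolean withinLimit = inclusive
                ? text.length() <= maxDocCharsToAnalyze  // inclusive comparison
                : text.length() < maxDocCharsToAnalyze;  // strict comparison from the report
        // Append the remaining text only if there is some and the check passes.
        if (lastEndOffset < text.length() && withinLimit) {
            out.append(text.substring(lastEndOffset));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "this is a text with searchterm in it"; // 36 characters
        // "in" and "it" are stopwords, so the last token ends after "searchterm".
        int lastEndOffset = text.indexOf("searchterm") + "searchterm".length();

        // prints: this is a text with searchterm
        System.out.println(fragment(text, lastEndOffset, text.length(), false));
        // prints: this is a text with searchterm in it
        System.out.println(fragment(text, lastEndOffset, text.length(), true));
    }
}
```

With the limit set to 37 or more, both variants return the full sentence, which is why the problem only shows up when maxDocCharsToAnalyze equals the text length exactly.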