[ https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Sekiguchi updated LUCENE-1822:
-----------------------------------
Attachment: LUCENE-1822.patch
Updated the patch for current trunk.
> FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too naive
> ----------------------------------------------------------------------------------
>
> Key: LUCENE-1822
> URL: https://issues.apache.org/jira/browse/LUCENE-1822
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/highlighter
> Affects Versions: 2.9
> Environment: any
> Reporter: Alex Vigdor
> Assignee: Koji Sekiguchi
> Priority: Minor
> Attachments: LUCENE-1822.patch, LUCENE-1822.patch
>
>
> The new FastVectorHighlighter performs extremely well. However, I've found in
> testing that the window of text chosen per fragment is often very poor,
> because SimpleFragListBuilder is hard-coded to always start the fragment 6
> characters to the left of the first phrase match. When selecting long
> fragments, this often means that there is barely any context before the
> highlighted word and lots after. Even worse, when highlighting a phrase at
> the end of a short text, the beginning is cut off even though the entire
> phrase would fit in the specified fragCharSize. For example, highlighting
> "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no
> matter what fragCharSize is specified. I am going to attach a patch that
> improves the text window selection by recalculating the starting margin once
> all phrases in the fragment have been identified. This way, if a single word
> is matched in a fragment, it will appear in the middle of the highlight
> instead of 6 characters from the beginning. One can also guarantee that the
> entirety of a short text is represented in a fragment by specifying a large
> enough fragCharSize.
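
The difference between the fixed margin and a recalculated one can be sketched roughly as follows. This is only an illustration of the idea in the description, not the attached patch and not Lucene's API: the class FragmentWindow and both method names are hypothetical, offsets are plain character offsets, and only a single match per fragment is considered.

```java
// Illustrative sketch only: hypothetical names, not Lucene's API or the attached patch.
final class FragmentWindow {
    private FragmentWindow() {}

    /** Fixed-margin behavior as described in the report: start `margin` chars before the match. */
    static int[] fixedMarginWindow(int matchStart, int textLength, int fragCharSize, int margin) {
        int start = Math.max(0, matchStart - margin);
        int end = Math.min(textLength, start + fragCharSize);
        return new int[] { start, end };
    }

    /** Centered alternative: split the unused fragment budget evenly around the match. */
    static int[] centeredWindow(int matchStart, int matchEnd, int textLength, int fragCharSize) {
        int matchLength = matchEnd - matchStart;
        int size = Math.max(fragCharSize, matchLength); // never cut the match itself
        int spare = size - matchLength;
        int start = Math.max(0, matchStart - spare / 2);
        int end = Math.min(textLength, start + size);
        // If the window hit the end of the text, slide the start left to use the leftover budget.
        start = Math.max(0, Math.min(start, end - size));
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        String text = "Crime and Punishment";
        int matchStart = text.indexOf("Punishment");
        int matchEnd = matchStart + "Punishment".length();
        int[] fixed = fixedMarginWindow(matchStart, text.length(), 100, 6);
        int[] centered = centeredWindow(matchStart, matchEnd, text.length(), 100);
        System.out.println(text.substring(fixed[0], fixed[1]));       // e and Punishment
        System.out.println(text.substring(centered[0], centered[1])); // Crime and Punishment
    }
}
```

With a fragCharSize of 100, the fixed 6-character margin still drops the start of the title, while the centered window clamps to the text bounds and returns the whole string. A real implementation would also have to handle several phrases per fragment, which this sketch leaves out.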