[ https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Vigdor updated LUCENE-1822:
--------------------------------
Attachment: LUCENE-1822.patch
> FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too
> naive
> ----------------------------------------------------------------------------------
>
> Key: LUCENE-1822
> URL: https://issues.apache.org/jira/browse/LUCENE-1822
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/*
> Affects Versions: 2.9
> Environment: any
> Reporter: Alex Vigdor
> Priority: Minor
> Attachments: LUCENE-1822.patch
>
>
> The new FastVectorHighlighter performs extremely well. However, I've found in
> testing that the window of text chosen per fragment is often very poor,
> because SimpleFragListBuilder is hard-coded to always start the fragment 6
> characters to the left of the first phrase match. When selecting long
> fragments, this often means that there is barely any context before the
> highlighted word and lots after it. Even worse, when highlighting a phrase at
> the end of a short text, the beginning is cut off even though the entire
> text would fit in the specified fragCharSize. For example, highlighting
> "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no
> matter what fragCharSize is specified. I am going to attach a patch that
> improves the text window selection by recalculating the starting margin once
> all phrases in the fragment have been identified. This way, if a single word
> is matched in a fragment, it will appear in the middle of the highlight
> instead of 6 characters from the beginning. One can also guarantee that the
> entirety of a short text is represented in a fragment by specifying a large
> enough fragCharSize.
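>
> As a rough illustration of the idea described above (this is not the
> attached patch, and FragmentWindow with its methods is a hypothetical helper,
> not a Lucene class), the window start could be recomputed from the span
> covered by all matches instead of a fixed 6 character margin. The sketch
> assumes offsets are plain character offsets into the stored text:

```java
// Hypothetical helper for illustration only; not part of Lucene or of the patch.
final class FragmentWindow {
    // The margin that the description says is hard-coded in SimpleFragListBuilder.
    static final int FIXED_MARGIN = 6;

    private FragmentWindow() {}

    /** Behaviour described in the report: start 6 characters left of the first match. */
    static int naiveStart(int firstMatchStart) {
        return Math.max(0, firstMatchStart - FIXED_MARGIN);
    }

    /**
     * Sketch of the proposed behaviour: once the first and last match in the
     * fragment are known, split the unused character budget evenly around them.
     * Returns {start, end} as offsets into the text (end is exclusive).
     */
    static int[] centeredWindow(int textLength, int firstMatchStart,
                                int lastMatchEnd, int fragCharSize) {
        int matchedLength = lastMatchEnd - firstMatchStart;
        int spare = Math.max(0, fragCharSize - matchedLength);
        int start = Math.max(0, firstMatchStart - spare / 2);
        int end = Math.min(textLength, start + fragCharSize);
        // If the window ran past the end of the text, slide it left so the
        // remaining budget is spent on leading context instead.
        start = Math.max(0, Math.min(start, end - fragCharSize));
        return new int[] {start, end};
    }
}
```

> For "Crime and Punishment" with the match at offsets 10 to 20, naiveStart
> gives 4, which reproduces the "e and Punishment" window, while
> centeredWindow with a fragCharSize of 100 gives the window 0 to 20, i.e.
> the whole text.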
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.