[ https://issues.apache.org/jira/browse/LUCENE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764168#action_12764168 ]
Chas Emerick commented on LUCENE-1822:
--------------------------------------

Thank you for the patch. I agree, the context surrounding each fragment could definitely be improved.

> FastVectorHighlighter: SimpleFragListBuilder hard-coded 6 char margin is too naive
> ----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1822
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1822
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>    Affects Versions: 2.9
>         Environment: any
>            Reporter: Alex Vigdor
>            Priority: Minor
>         Attachments: LUCENE-1822.patch
>
>
> The new FastVectorHighlighter performs extremely well; however, I've found in
> testing that the window of text chosen per fragment is often very poor, as it
> is hard-coded in SimpleFragListBuilder to always select starting 6 characters
> to the left of the first phrase match in a fragment. When selecting long
> fragments, this often means that there is barely any context before the
> highlighted word, and lots after; even worse, when highlighting a phrase at
> the end of a short text the beginning is cut off, even though the entire
> phrase would fit in the specified fragCharSize. For example, highlighting
> "Punishment" in "Crime and Punishment" returns "e and <b>Punishment</b>" no
> matter what fragCharSize is specified. I am going to attach a patch that
> improves the text window selection by recalculating the starting margin once
> all phrases in the fragment have been identified - this way if a single word
> is matched in a fragment, it will appear in the middle of the highlight,
> instead of 6 characters from the beginning. This way one can also guarantee
> that the entirety of a short text is represented in a fragment by specifying
> a large enough fragCharSize.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
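To make the issue description above concrete, here is a minimal sketch of the two window-selection strategies it contrasts: a fixed 6-character left margin versus a start offset recalculated so the match sits roughly in the middle of the fragment. This is not the code from LUCENE-1822.patch and it does not use any Lucene classes; the class FragmentWindowSketch and its methods are hypothetical names invented for illustration, and the sketch assumes a single matched span per fragment.

```java
public final class FragmentWindowSketch {
    private FragmentWindowSketch() {}

    /** Fixed-margin strategy: start `margin` chars left of the match, clamped at 0. */
    public static int fixedMarginStart(int matchStart, int margin) {
        return Math.max(0, matchStart - margin);
    }

    /**
     * Centered strategy (illustrative assumption, not the patch itself):
     * split the characters left over after the match evenly on both sides,
     * then shift the window left if it would run past the end of the text.
     */
    public static int centeredStart(int textLength, int matchStart, int matchEnd,
                                    int fragCharSize) {
        int matchLength = matchEnd - matchStart;
        int spare = Math.max(0, fragCharSize - matchLength);
        int start = Math.max(0, matchStart - spare / 2);
        int end = Math.min(textLength, start + fragCharSize);
        // Reclaim unused space on the right as extra context on the left.
        return Math.max(0, end - fragCharSize);
    }

    /** Cuts a fragment of at most fragCharSize characters beginning at start. */
    public static String fragment(String text, int start, int fragCharSize) {
        int end = Math.min(text.length(), start + fragCharSize);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Crime and Punishment";
        int matchStart = text.indexOf("Punishment");   // 10
        int matchEnd = matchStart + "Punishment".length();
        int fragCharSize = 100;

        int fixed = fixedMarginStart(matchStart, 6);
        int centered = centeredStart(text.length(), matchStart, matchEnd, fragCharSize);

        System.out.println(fragment(text, fixed, fragCharSize));    // e and Punishment
        System.out.println(fragment(text, centered, fragCharSize)); // Crime and Punishment
    }
}
```

With the fixed margin, the fragment always begins 6 characters before the match, so the start of the short title is lost regardless of fragCharSize. With the recalculated start, a sufficiently large fragCharSize yields the whole text, which is the guarantee the description asks for.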
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org