[ https://issues.apache.org/jira/browse/LUCENE-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744872#action_12744872 ]
Koji Sekiguchi commented on LUCENE-1824:
----------------------------------------
Alex,
I don't have much time to look into this patch, but I understand the requirement.
I named it *Simple* FragmentsBuilder because it simply makes fragments without
regard for boundaries. I designed FragmentsBuilder to be pluggable, so that
other FragmentsBuilders can be written and contributed, e.g.
WhitespaceFragmentsBuilder, SentenceAwareFragmentsBuilder, etc. I think adding
new FragmentsBuilders (plus test cases) is better than modifying the existing
ones. When you write new FragmentsBuilders, don't forget that some languages
(CJK) don't use periods or whitespace to mark word and sentence boundaries.
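
To illustrate the CJK caveat, here is a minimal sketch of how a sentence-aware
builder might locate boundaries without searching for '.' or ' ' itself. It
relies on the JDK's locale-aware java.text.BreakIterator. The class
SentenceBoundarySketch and the method expandToSentence are hypothetical names
used for illustration only; they are not part of the Lucene API or of the
attached patch.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper for illustration; not a Lucene class.
final class SentenceBoundarySketch {

    /**
     * Widens the fragment [start, end) to the enclosing sentence boundaries,
     * as reported by the JDK's BreakIterator for the given locale.
     * Returns {newStart, newEnd}.
     */
    static int[] expandToSentence(String text, int start, int end, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);

        int newStart = start;
        if (!it.isBoundary(start)) {
            // Last sentence boundary before the fragment start.
            int p = it.preceding(start);
            newStart = (p == BreakIterator.DONE) ? 0 : p;
        }

        int newEnd = end;
        if (!it.isBoundary(end)) {
            // First sentence boundary after the fragment end.
            int n = it.following(end);
            newEnd = (n == BreakIterator.DONE) ? text.length() : n;
        }
        return new int[] { newStart, newEnd };
    }
}
```

Because the boundary rules come from the Locale passed in, the same code path
could in principle serve languages that do not separate sentences with a
period and a space, though how well it does so depends on the JDK's rules for
that locale.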
> FastVectorHighlighter truncates words at beginning and end of fragments
> -----------------------------------------------------------------------
>
> Key: LUCENE-1824
> URL: https://issues.apache.org/jira/browse/LUCENE-1824
> Project: Lucene - Java
> Issue Type: Improvement
> Components: contrib/*
> Environment: any
> Reporter: Alex Vigdor
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1824-test.patch, LUCENE-1824.patch
>
>
> FastVectorHighlighter does not take word boundaries into consideration when
> building fragments, so that in most cases the first and last word of a
> fragment are truncated. This makes the highlights less legible than they
> should be. I will attach a patch to BaseFragmentsBuilder that resolves this
> by expanding the start and end boundaries of the fragment to the first
> whitespace character on either side of the fragment, or the beginning or end
> of the source text, whichever comes first. This significantly improves
> legibility, at the cost of returning a slightly larger number of characters
> than specified for the fragment size.
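
The expansion described in the issue can be sketched roughly as follows. This
is a standalone illustration under stated assumptions, not the code in
LUCENE-1824.patch: the class WhitespaceBoundarySketch and the method
expandToWhitespace are hypothetical names, and Character.isWhitespace is
assumed to be an acceptable definition of "whitespace character".

```java
// Hypothetical helper for illustration; not a Lucene class.
final class WhitespaceBoundarySketch {

    /**
     * Widens the fragment [start, end) outward until each side reaches a
     * whitespace character or the edge of the source text, whichever comes
     * first. Returns {newStart, newEnd}.
     */
    static int[] expandToWhitespace(CharSequence text, int start, int end) {
        int newStart = start;
        // Move left while the character just before the fragment is not whitespace.
        while (newStart > 0 && !Character.isWhitespace(text.charAt(newStart - 1))) {
            newStart--;
        }

        int newEnd = end;
        // Move right while the character just after the fragment is not whitespace.
        while (newEnd < text.length() && !Character.isWhitespace(text.charAt(newEnd))) {
            newEnd++;
        }
        return new int[] { newStart, newEnd };
    }
}
```

For the text "the quick brown fox", the fragment [6, 12) ("ick br") widens to
[4, 15) ("quick brown"), so the result can exceed the requested fragment size
by up to one partial word on each side.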