[ 
https://issues.apache.org/jira/browse/LUCENE-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744830#action_12744830
 ] 

Alex Vigdor edited comment on LUCENE-1824 at 8/18/09 7:25 PM:
--------------------------------------------------------------

Actually, a couple of the existing tests specifically check for the faulty 
behavior. The attached patch for SimpleFragmentsBuilderTest instead tests for 
the non-truncating behavior implemented in the patch.  For example, where the 
prior test looked for "ssing <b>speed</b>", it now looks for " processing 
<b>speed</b>".  While fixing the tests I noticed an off-by-one error in the 
original patch, which I have updated.


  
> FastVectorHighlighter truncates words at beginning and end of fragments
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1824
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1824
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>         Environment: any
>            Reporter: Alex Vigdor
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1824-test.patch, LUCENE-1824.patch
>
>
> FastVectorHighlighter does not take word boundaries into consideration when 
> building fragments, so that in most cases the first and last word of a 
> fragment are truncated.  This makes the highlights less legible than they 
> should be.  I will attach a patch to BaseFragmentsBuilder that resolves this 
> by expanding the start and end boundaries of the fragment to the first 
> whitespace character on either side of the fragment, or the beginning or end 
> of the source text, whichever comes first.  This significantly improves 
> legibility, at the cost of returning a slightly larger number of characters 
> than specified for the fragment size.
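
The boundary expansion described above can be sketched roughly as follows. 
This is only an illustration of the idea, not the code in the attached patch: 
the class name FragmentBoundarySketch and the method expandToWhitespace are 
hypothetical, the end offset is assumed to be exclusive, and the sketch leaves 
the boundary whitespace itself out of the fragment, which may differ from what 
the patch does.

```java
public class FragmentBoundarySketch {

    /**
     * Hypothetical helper: widens [start, end) so the fragment does not cut
     * a word in half. Assumes 0 <= start <= end <= source.length().
     * Returns {newStart, newEnd}, with newEnd exclusive.
     */
    public static int[] expandToWhitespace(String source, int start, int end) {
        // Walk left until the previous character is whitespace
        // or the beginning of the source text is reached.
        while (start > 0 && !Character.isWhitespace(source.charAt(start - 1))) {
            start--;
        }
        // Walk right until the next character is whitespace
        // or the end of the source text is reached.
        while (end < source.length() && !Character.isWhitespace(source.charAt(end))) {
            end++;
        }
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        String text = "improves processing speed considerably";
        // A fixed-size window that starts and ends in the middle of a word.
        int start = 14;
        int end = 27;
        int[] expanded = expandToWhitespace(text, start, end);
        System.out.println("[" + text.substring(start, end) + "]");
        System.out.println("[" + text.substring(expanded[0], expanded[1]) + "]");
    }
}
```

As the issue description notes, the expanded fragment can be longer than the 
requested fragment size, since each side may grow by up to one partial word.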

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
