[ https://issues.apache.org/jira/browse/LUCENE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexandre Patry updated LUCENE-5019:
------------------------------------

    Description: 
In SimpleSpanFragmenter, when a query term is followed by a stop word, the 
fragment will run until the end of the document.

When a query term is encountered (line 80), SimpleSpanFragmenter waits for the
token following it before allowing the fragment to end (lines 68 to 72). When a
stop word follows the query term (or the next token otherwise has a position
increment greater than 1), the expected position is skipped and the token
SimpleSpanFragmenter is waiting for never arrives.

The attached patch fixes that by waiting for the first token following the 
query word instead of the token at the position after the query term.

  was:
In SimpleFragmentScorer, when a query term is followed by a stop word, the 
fragment will run until the end of the document.

When a query term is encountered (line 80), SimpleFragmentScorer waits for the 
token following it before allowing the fragment to end (lines 68 to 72). When a 
stop word follows the query word (or any token with a position increment 
greater than 1), its position is skipped and the token SimpleFragmentScorer is 
waiting for never arrives.

The attached patch fixes that by waiting for the first token following the 
query word instead of the token at the position after the query term.

        Summary: SimpleSpanFragmenter can create very long fragments  (was: 
SimpleFragmentScorer can create very long fragments)
    
> SimpleSpanFragmenter can create very long fragments
> ---------------------------------------------------
>
>                 Key: LUCENE-5019
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5019
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>    Affects Versions: 4.3
>            Reporter: Alexandre Patry
>            Priority: Minor
>         Attachments: simple-span-fragmenter.patch
>
>
> In SimpleSpanFragmenter, when a query term is followed by a stop word, the 
> fragment will run until the end of the document.
> When a query term is encountered (line 80), SimpleSpanFragmenter waits for
> the token following it before allowing the fragment to end (lines 68 to 72).
> When a stop word follows the query term (or the next token otherwise has a
> position increment greater than 1), the expected position is skipped and the
> token SimpleSpanFragmenter is waiting for never arrives.
> The attached patch fixes that by waiting for the first token following the 
> query word instead of the token at the position after the query term.
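
The behaviour described above can be illustrated with a small standalone
sketch. It does not use the Lucene API and is not the attached patch: the
class name, method name, and parameters are hypothetical, and the token stream
is reduced to an array of position increments. With exactMatch set to true the
code waits for one specific position, as the description says the current
code does; with exactMatch set to false it accepts the first token at or
beyond that position, which is the idea behind the fix.

```java
// Hypothetical, simplified model of the "wait for position" logic.
// Not Lucene code: names and structure are assumptions for illustration.
final class WaitForPositionSketch {

    /**
     * Returns the index of the first token at which a fragment break is
     * allowed again after the query term, or -1 if that never happens.
     *
     * @param posIncs    position increment of each token; a removed stop
     *                   word shows up as an increment of 2 on the next token
     * @param matchIndex index of the token that matches the query term
     * @param exactMatch true: wait for exactly the next position (buggy);
     *                   false: accept the first token at or past it (fixed)
     */
    static int firstBreakableToken(int[] posIncs, int matchIndex, boolean exactMatch) {
        int position = -1;
        int waitForPos = -1; // -1 means "not waiting for anything"
        for (int i = 0; i < posIncs.length; i++) {
            position += posIncs[i];
            if (waitForPos != -1) {
                boolean arrived = exactMatch
                        ? position == waitForPos
                        : position >= waitForPos;
                if (arrived) {
                    return i;
                }
            }
            if (i == matchIndex) {
                // Wait for the token right after the query term.
                waitForPos = position + 1;
            }
        }
        return -1; // the awaited token never showed up
    }
}
```

For a stream with increments {1, 1, 2, 1} and the query term at index 1, the
tokens sit at positions 0, 1, 3 and 4. The exact comparison waits for position
2, which never occurs, so the method returns -1 and the fragment would run to
the end of the document. The relaxed comparison returns index 2, the first
token after the query term.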

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
