[ 
https://issues.apache.org/jira/browse/LUCENE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501080
 ] 

Hoss Man commented on LUCENE-902:
---------------------------------

without a unit test demonstrating an actual problem, i'm having a hard time 
understanding what exactly the "bug" is in this issue.

from what i can tell based on the comments and my reading of the patch, Toru is 
concerned about cumulative positionIncrements of tokens being lost when one of 
those tokens is a stop word.  (ie: if indexing multiple names of movies in a 
Document about an actor, and using a positionIncrement of "10" between each 
Field value (ie: movie name), indexing the values "Dirty Harry" and "The Good 
the Bad and the Ugly" could result in no gap between the tokens "harry" and 
"good" since "the" is a stop word.)

is my understanding of the problem correct?

if so, then i'm not sure how this patch really addresses the problem ... 
besides the fact that it treats "1" as a special case (the problem can come up 
with any positionIncrement) it doesn't seem to make any allowance for the 
situation where multiple stop words appear in sequence.
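
to make the scenario concrete, here is a rough, self-contained sketch of the 
"Dirty Harry" example.  it does not use Lucene at all: Tok, dropIncrements and 
carryIncrements are hypothetical stand-ins invented for illustration, not the 
StopFilter code or the attached patch.  it also assumes the gap of 10 shows up 
as an increment of 11 on the first token of the second value, and that "the" 
and "and" are the only stop words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopGapSketch {
    // Hypothetical stand-in for a token: only the text and its position increment.
    public static final class Tok {
        public final String text;
        public final int posInc;
        public Tok(String text, int posInc) { this.text = text; this.posInc = posInc; }
    }

    // Assumed stop word list for this example only.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("the", "and"));

    // Drops stop words and throws their increments away.
    public static List<Tok> dropIncrements(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            if (!STOP.contains(t.text)) out.add(t);
        }
        return out;
    }

    // Drops stop words but adds their increments to the next kept token,
    // for any increment value and any run of consecutive stop words.
    public static List<Tok> carryIncrements(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        int skipped = 0;
        for (Tok t : in) {
            if (STOP.contains(t.text)) {
                skipped += t.posInc;
            } else {
                out.add(new Tok(t.text, t.posInc + skipped));
                skipped = 0;
            }
        }
        return out;
    }

    // Absolute position of the first token with this text (first token of a
    // stream with increment 1 sits at position 0), or -1 if it is absent.
    public static int positionOf(List<Tok> toks, String text) {
        int pos = -1;
        for (Tok t : toks) {
            pos += t.posInc;
            if (t.text.equals(text)) return pos;
        }
        return -1;
    }

    // "Dirty Harry" followed by "The Good the Bad and the Ugly", lowercased.
    public static List<Tok> sample() {
        return Arrays.asList(
            new Tok("dirty", 1), new Tok("harry", 1),
            new Tok("the", 11), // assumed: gap of 10 plus the usual 1
            new Tok("good", 1), new Tok("the", 1), new Tok("bad", 1),
            new Tok("and", 1), new Tok("the", 1), new Tok("ugly", 1));
    }

    public static void main(String[] args) {
        List<Tok> dropped = dropIncrements(sample());
        List<Tok> carried = carryIncrements(sample());
        System.out.println("dropped: harry=" + positionOf(dropped, "harry")
            + " good=" + positionOf(dropped, "good"));
        System.out.println("carried: harry=" + positionOf(carried, "harry")
            + " good=" + positionOf(carried, "good")
            + " ugly=" + positionOf(carried, "ugly"));
    }
}
```

in this sketch, dropIncrements leaves "harry" and "good" at adjacent positions, 
while carryIncrements keeps the gap and also handles the "and the" run before 
"ugly".  a unit test along these lines against the real StopFilter would show 
whether the behavior described in the issue actually occurs.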

i'm also not clear on why non stop words immediately following stop words (ie: 
the "else if(flag)" case) are not returned unless their positionIncrement is 1.





> Check on PositionIncrement  with StopFilter.
> --------------------------------------------
>
>                 Key: LUCENE-902
>                 URL: https://issues.apache.org/jira/browse/LUCENE-902
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Analysis
>    Affects Versions: 2.2
>            Reporter: Toru Matsuzawa
>         Attachments: stopfilter.patch
>
>
> PositionIncrement set with Tokenizer is not considered with StopFilter. 
> When PositionIncrement of Token is 1, it is deleted by StopFilter. However, 
> when PositionIncrement of Token following afterwards is 0, it is not deleted. 
> I think that it is necessary to be deleted. Because it is thought same Token 
> when PositionIncrement is 0.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

