[
https://issues.apache.org/jira/browse/LUCENE-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554482
]
Doron Cohen commented on LUCENE-1101:
-------------------------------------
Currently Token.clear() is used only for un-tokenized fields in DocumentsWriter;
Tokenizer implementations of next(Token) do not call it.
I think they can be modified to call it (instead of explicitly resetting just
the pos-incr).
But since these methods already set the value for start-offset, calling
clear() as well might eat the speed-up gained by reusing tokens.
But then again, shouldn't tokenizers also reset the payload info? (It seems wrong
to assume that there's no payload in the input reusable token.)
So I guess the right thing to do is to call clear() in all tokenizers (3,
actually). I will work on that path.
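
To illustrate the idea, here is a minimal sketch of a reusing tokenizer that
calls clear() before filling in the token. It does not use the Lucene classes:
SketchToken and SketchWhitespaceTokenizer are hypothetical stand-ins, and which
fields clear() resets here is an assumption made for the example, not a
statement of what Token.clear() does in any Lucene release.

```java
// Hypothetical stand-in for a reusable token; only the fields discussed above.
class SketchToken {
    String term = "";
    int startOffset;
    int endOffset;
    int positionIncrement = 1;
    byte[] payload;

    // Assumed behavior for this sketch: drop state left over from a previous
    // use. Offsets are left alone because the tokenizer always overwrites them.
    void clear() {
        term = "";
        positionIncrement = 1;
        payload = null;
    }
}

// Hypothetical whitespace tokenizer implementing the reuse form next(token).
class SketchWhitespaceTokenizer {
    private final String input;
    private int pos = 0;

    SketchWhitespaceTokenizer(String input) {
        this.input = input;
    }

    // Fills in and returns 'result', or returns null at end of input.
    SketchToken next(SketchToken result) {
        // Skip leading whitespace.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        if (pos >= input.length()) {
            return null;
        }
        int start = pos;
        while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
            pos++;
        }
        // Reset pos-incr and payload in one call instead of resetting
        // only the position increment explicitly.
        result.clear();
        result.term = input.substring(start, pos);
        result.startOffset = start;
        result.endOffset = pos;
        return result;
    }
}
```

With this shape, a token handed in with a stale position increment or payload
comes back with both reset, at the cost of a few extra field writes per token.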
> Tokenizers should reset positionIncrement to 1 in their next(Token result)
> ---------------------------------------------------------------------------
>
> Key: LUCENE-1101
> URL: https://issues.apache.org/jira/browse/LUCENE-1101
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.3
> Reporter: Doron Cohen
> Assignee: Doron Cohen
> Fix For: 2.3
>
> Attachments: lucene-1101.patch
>
>
> Tokenizers which implement the reuse form of the next method:
> next(Token result)
> should reset the positionIncrement of the returned token to 1.
--
This message is automatically generated by JIRA.