[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620408#action_12620408 ]
Doron Cohen commented on LUCENE-1350:
-------------------------------------

{quote}
The non-reuse interface is deprecated. LUCENE-1333 deals with cleaning that up and applying reuse in all of Lucene. To date, it was partially applied to core. This results in sub-optimal performance with Filter chains that use both reuse and non-reuse inputs and filters.
{quote}

The non-reuse TokenStream API is not deprecated in the trunk.
I guess you mean it will be deprecated by LUCENE-1333.

{quote}
To me, it is not clearcut what a producer or a consumer actually is. Obviously, input streams are producers. Some filters generate multiple tokens as a replacement for the current one (e.g. NGram, stemming, ...). To me, these are producers.
{quote}

Right, such filters function as producers.
The Javadocs should say something weaker, like "most filters are consumers" or "filters are usually consumers".

{quote}
I don't know why the following pattern was not originally used (some filters do this) or why you didn't migrate to this:

    Token token = input.next();
    ...
    String newTerm = ....;
    ...
    token.setTermText(newTerm);
    return token;

This would be faster than cloning and would preserve all fields.
{quote}

Good point, thanks.

So I wonder what's next with this issue.
The complete LUCENE-1333 is scheduled for 2.4.
So it seems appropriate to fix the filters' behavior now, so that they preserve the payload (and flags, thanks for pointing this out), following the above (reuse) code pattern. Makes sense?

> SnowballFilter resets the payload
> ---------------------------------
>
>          Key: LUCENE-1350
>          URL: https://issues.apache.org/jira/browse/LUCENE-1350
>      Project: Lucene - Java
>   Issue Type: Bug
>   Components: Analysis, contrib/*
>     Reporter: Doron Cohen
>     Assignee: Doron Cohen
>  Attachments: LUCENE-1350.patch
>
>
> Passing tokens with payloads through SnowballFilter results in tokens with no
> payloads.
> A workaround for this is to apply stemming first and only then run whatever
> logic creates the payload, but this is not always convenient.
> Patch to follow that preserves the payload.
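
To make the reuse pattern quoted in the comment more concrete, here is a minimal, self-contained sketch. It deliberately does not use Lucene: `Token`, `TokenStream`, `ListStream`, `RebuildingFilter`, `InPlaceFilter` and the toy `stem` method are all hypothetical stand-ins invented for illustration, and `RebuildingFilter` is only an assumed example of how a filter might drop a payload, not a copy of what SnowballFilter does.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Lucene's classes.
public class ReuseSketch {

    /** Minimal token: a term plus the extra fields a filter should not lose. */
    static final class Token {
        String term;
        byte[] payload;
        int flags;

        Token(String term, byte[] payload, int flags) {
            this.term = term;
            this.payload = payload;
            this.flags = flags;
        }
    }

    /** Minimal stream contract: returns the next token, or null at the end. */
    interface TokenStream {
        Token next();
    }

    /** Source stream backed by a fixed list of tokens. */
    static final class ListStream implements TokenStream {
        private final Iterator<Token> it;

        ListStream(List<Token> tokens) {
            this.it = tokens.iterator();
        }

        public Token next() {
            return it.hasNext() ? it.next() : null;
        }
    }

    /** Toy "stemmer" (an assumption for the demo): strips one trailing 's'. */
    static String stem(String term) {
        return term.endsWith("s") ? term.substring(0, term.length() - 1) : term;
    }

    /** Builds a fresh token from the term only, so payload and flags are dropped. */
    static final class RebuildingFilter implements TokenStream {
        private final TokenStream input;

        RebuildingFilter(TokenStream input) {
            this.input = input;
        }

        public Token next() {
            Token token = input.next();
            if (token == null) {
                return null;
            }
            return new Token(stem(token.term), null, 0);
        }
    }

    /** Mutates the incoming token in place, so every other field survives. */
    static final class InPlaceFilter implements TokenStream {
        private final TokenStream input;

        InPlaceFilter(TokenStream input) {
            this.input = input;
        }

        public Token next() {
            Token token = input.next();
            if (token == null) {
                return null;
            }
            token.term = stem(token.term); // only the term changes
            return token;                  // same instance, no allocation
        }
    }

    public static void main(String[] args) {
        Token a = new Token("payloads", new byte[] {42}, 1);
        Token b = new Token("payloads", new byte[] {42}, 1);

        Token rebuilt = new RebuildingFilter(new ListStream(Arrays.asList(a))).next();
        Token reused = new InPlaceFilter(new ListStream(Arrays.asList(b))).next();

        System.out.println(rebuilt.term + " payload kept: " + (rebuilt.payload != null));
        System.out.println(reused.term + " payload kept: " + (reused.payload != null));
    }
}
```

In this sketch the in-place variant returns the very instance it received, so the payload and flags come through untouched and no new object is allocated, which is the property the quoted pattern relies on.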