[ https://issues.apache.org/jira/browse/LUCENE-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745667#action_12745667 ]
Tim Smith commented on LUCENE-1826:
-----------------------------------

This is further complicated by the fact that Tokenizers are often "held onto" in a thread local.

So Tokenizer.reset(Reader) should also take an AttributeSource in order to really reset things properly.

Also, all TokenFilters/TokenStreams would then be required to reinit their held-onto "attributes" at reset() time, not at constructor time; otherwise they could be holding onto stale attributes.

> All Tokenizer implementations should have constructor that takes an AttributeSource
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-1826
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1826
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>             Fix For: 2.9
>
>
> I have a TokenStream implementation that joins together multiple sub TokenStreams (i then do additional filtering on top of this, so i can't just have the indexer do the merging)
> in 2.4, this worked fine.
> once one sub stream was exhausted, i just started using the next stream
> however, in 2.9, this is very difficult, and requires copying Term buffers for every token being aggregated
> however, if all the sub TokenStreams share the same AttributeSource, and my "concat" TokenStream shares the same AttributeSource, this goes back to being very simple (and very efficient)
> So for example, i would like to see the following constructor added to StandardTokenizer:
> {code}
> public StandardTokenizer(AttributeSource source, Reader input, boolean replaceInvalidAcronym) {
>   super(source);
>   ...
> }
> {code}
> would likewise want similar constructors added to all Tokenizer sub classes provided by lucene
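
To illustrate why sharing one attribute holder makes the "concat" stream simple, here is a minimal, self-contained sketch. It does not use Lucene: SharedAttributes, SimpleStream, WhitespaceStream and ConcatStream are hypothetical stand-ins invented for this example, and they only mimic the idea of sub streams writing into the same attribute state. Under that assumption, the concatenating stream just delegates to the current sub stream and never copies a term buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class SharedAttributeConcat {

    // Hypothetical stand-in for a shared attribute holder (NOT Lucene's AttributeSource).
    static final class SharedAttributes {
        final StringBuilder term = new StringBuilder();

        void setTerm(String s) {
            term.setLength(0);
            term.append(s);
        }
    }

    // Hypothetical stand-in for a token stream bound to a shared attribute holder.
    static abstract class SimpleStream {
        protected final SharedAttributes attrs;

        SimpleStream(SharedAttributes attrs) {
            this.attrs = attrs;
        }

        // Advances to the next token, writing it into attrs; false when exhausted.
        abstract boolean incrementToken();
    }

    // Splits text on whitespace and writes each word into the shared term buffer.
    static final class WhitespaceStream extends SimpleStream {
        private final Iterator<String> words;

        WhitespaceStream(SharedAttributes attrs, String text) {
            super(attrs);
            String trimmed = text.trim();
            this.words = trimmed.isEmpty()
                    ? Collections.<String>emptyIterator()
                    : Arrays.asList(trimmed.split("\\s+")).iterator();
        }

        @Override
        boolean incrementToken() {
            if (!words.hasNext()) {
                return false;
            }
            attrs.setTerm(words.next());
            return true;
        }
    }

    // Once one sub stream is exhausted, moves on to the next one.
    // No copying: every sub stream already writes into the same attrs instance.
    static final class ConcatStream extends SimpleStream {
        private final Iterator<SimpleStream> subs;
        private SimpleStream current;

        ConcatStream(SharedAttributes attrs, List<SimpleStream> subs) {
            super(attrs);
            for (SimpleStream s : subs) {
                if (s.attrs != attrs) {
                    throw new IllegalArgumentException("sub streams must share the same attributes");
                }
            }
            this.subs = subs.iterator();
        }

        @Override
        boolean incrementToken() {
            while (true) {
                if (current != null && current.incrementToken()) {
                    return true;
                }
                if (!subs.hasNext()) {
                    return false;
                }
                current = subs.next();
            }
        }
    }

    // Convenience helper: concatenates the tokens of several texts via one shared holder.
    public static List<String> concatTerms(String... texts) {
        SharedAttributes attrs = new SharedAttributes();
        List<SimpleStream> subs = new ArrayList<SimpleStream>();
        for (String text : texts) {
            subs.add(new WhitespaceStream(attrs, text));
        }
        ConcatStream concat = new ConcatStream(attrs, subs);
        List<String> out = new ArrayList<String>();
        while (concat.incrementToken()) {
            out.add(attrs.term.toString()); // read from the shared holder, as a filter on top would
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(concatTerms("quick brown", "fox jumps")); // prints [quick, brown, fox, jumps]
    }
}
```

The constructor check mirrors the point of the request: the approach only works if each sub stream can be constructed against a caller-supplied attribute holder, which is what an AttributeSource-taking constructor on every Tokenizer would allow.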