[ 
https://issues.apache.org/jira/browse/LUCENE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746428#action_12746428
 ] 

Uwe Schindler edited comment on LUCENE-1842 at 8/22/09 1:13 AM:
----------------------------------------------------------------

I still do not understand your proposal. You can always create all tokenizer 
chains at the beginning with exactly one AttributeSource (after LUCENE-1826). 
You are then free to call incrementToken() on all sub-tokenstreams and all 
these calls will put the tokenized values in the same attributes.
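
As a rough sketch of what I mean, here is the idea with simplified stand-in classes. This is not the real Lucene API: SharedSource, TermAttr, WordStream and ConcatStream are hypothetical names made up for illustration. Every sub-stream is created up front on the same source, so the consumer reads one attribute instance regardless of which sub-stream produced the token.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins only; none of these classes are Lucene's API.
class SharedSourceDemo {
    static List<String> run() {
        SharedSource source = new SharedSource();
        // Both sub-streams are created at the beginning on the same source.
        WordStream title = new WordStream(source, "hello world");
        WordStream body = new WordStream(source, "foo bar");
        ConcatStream all = new ConcatStream(title, body);

        // The consumer fetches the attribute once and keeps reading from it.
        TermAttr term = source.addAttribute();
        List<String> seen = new ArrayList<>();
        while (all.incrementToken()) {
            seen.add(term.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

// The single attribute instance that all streams write into.
class TermAttr {
    String value = "";
}

// Creates the attribute on first request and returns the same one afterwards.
class SharedSource {
    private TermAttr term;

    TermAttr addAttribute() {
        if (term == null) {
            term = new TermAttr();
        }
        return term;
    }
}

// A trivial whitespace "tokenizer" bound to a source at construction time.
class WordStream {
    private final TermAttr term;
    private final String[] words;
    private int pos = 0;

    WordStream(SharedSource source, String text) {
        this.term = source.addAttribute();
        this.words = text.split(" ");
    }

    boolean incrementToken() {
        if (pos >= words.length) {
            return false;
        }
        term.value = words[pos++];
        return true;
    }
}

// Drains each sub-stream in turn; it needs no attribute copying at all.
class ConcatStream {
    private final WordStream[] parts;
    private int current = 0;

    ConcatStream(WordStream... parts) {
        this.parts = parts;
    }

    boolean incrementToken() {
        while (current < parts.length) {
            if (parts[current].incrementToken()) {
                return true;
            }
            current++;
        }
        return false;
    }
}
```

The concatenating stream only forwards incrementToken() calls, because all sub-streams already write into the attribute the consumer holds.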

Adding a reset(AttributeSource) method would not really help, as you would have
to do this for the whole Tokenizer chain. If you do it the wrong way, some
TokenFilters in the chain may end up using a different AttributeSource, and
so on. Because of all these problems and the complexity, we do not want to have
setters for AttributeSources or changes of AttributeFactory and so on (this is
why the design makes the fields "final" and why we have this extra warning).
During the lifetime of one TokenStream, there is in my opinion no real use case
for changing its attribute maps that would justify the added complexity and
risk of errors.
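
To illustrate the risk, here is a small sketch of a chain where only the tokenizer is reset. The classes are simplified stand-ins and not Lucene's API: DemoSource, DemoTerm, DemoTokenizer and DemoUpperFilter are hypothetical, and the reset method only mimics the shape of the proposal. The filter looked up its attribute in the constructor, so after the partial reset it keeps reading from the old source.

```java
// Illustrative stand-ins only; none of these classes are Lucene's API.
class PartialResetDemo {
    static String run(boolean resetTokenizerOnly) {
        DemoSource original = new DemoSource();
        DemoTokenizer tokenizer = new DemoTokenizer(original);
        DemoUpperFilter filter = new DemoUpperFilter(tokenizer, original);
        if (resetTokenizerOnly) {
            // Swap the source on the tokenizer but forget the filter.
            tokenizer.reset(new DemoSource());
        }
        return filter.next("lucene");
    }

    public static void main(String[] args) {
        System.out.println("consistent chain: '" + run(false) + "'");
        System.out.println("partial reset:    '" + run(true) + "'");
    }
}

class DemoTerm {
    String value = "";
}

class DemoSource {
    final DemoTerm term = new DemoTerm();
}

class DemoTokenizer {
    private DemoTerm term; // non-final, as the proposal would require

    DemoTokenizer(DemoSource source) {
        this.term = source.term;
    }

    // Hypothetical counterpart of the proposed reset(AttributeSource).
    void reset(DemoSource source) {
        this.term = source.term;
    }

    void next(String word) {
        term.value = word;
    }
}

class DemoUpperFilter {
    private final DemoTokenizer input;
    private final DemoTerm term; // looked up once, at construction time

    DemoUpperFilter(DemoTokenizer input, DemoSource source) {
        this.input = input;
        this.term = source.term;
    }

    String next(String word) {
        input.next(word);                // tokenizer writes to its current source
        return term.value.toUpperCase(); // filter reads the source it was built on
    }
}
```

With a consistent chain the filter returns the upper-cased token. After the partial reset it returns an empty string, because the token was written to an attribute the filter never sees.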

The cost of adding Attributes is very low if you reuse TokenStreams, which you
could even do with your concatenating TokenStream.
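
A minimal sketch of why the cost stays low, again with stand-in classes rather than the real Lucene API (AttrRegistry and OffsetAttr are hypothetical names, and the factory parameter is an assumption made to keep the example self-contained): once the attribute exists, every later addAttribute() call is just a map lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative stand-ins only; none of these classes are Lucene's API.
class ReuseCostDemo {
    static int run(int documents) {
        AttrRegistry registry = new AttrRegistry();
        for (int i = 0; i < documents; i++) {
            // A reused stream may ask for its attribute again for each document.
            registry.addAttribute(OffsetAttr.class, OffsetAttr::new);
        }
        return registry.created;
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + run(1000));
    }
}

class OffsetAttr {
    int start;
    int end;
}

class AttrRegistry {
    private final Map<Class<?>, Object> attributes = new HashMap<>();
    int created = 0; // counts real instantiations, for the demo only

    <T> T addAttribute(Class<T> type, Supplier<T> factory) {
        Object existing = attributes.get(type);
        if (existing == null) {
            // Only the first request pays for creating the instance.
            existing = factory.get();
            attributes.put(type, existing);
            created++;
        }
        return type.cast(existing);
    }
}
```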

      was (Author: thetaphi):
    I still do not understand your proposal. You can always create all 
tokenizer chains at the beginning with exactly one tokenizer (after 
LUCENE-1826). You are then free to call incrementToken() on all 
sub-tokenstreams and all these calls will put the tokenized values in the same 
attributes.

Adding a reset(AttributeSource) method would not really help, as you would have
to do this for the whole Tokenizer chain. If you do it the wrong way, some
TokenFilters in the chain may end up using a different AttributeSource, and
so on. Because of all these problems and the complexity, we do not want to have
setters for AttributeSources or changes of AttributeFactory and so on. During
the lifetime of one TokenStream, there is in my opinion no real use case for
changing its attribute maps that would justify the added complexity and risk of
errors.

The cost of adding Attributes is very low if you reuse TokenStreams, which you
could even do with your concatenating TokenStream.
  
> Add reset(AttributeSource) method to AttributeSource
> ----------------------------------------------------
>
>                 Key: LUCENE-1842
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1842
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: Analysis
>            Reporter: Tim Smith
>            Priority: Minor
>
> Originally proposed in LUCENE-1826
> Proposing the addition of the following method to AttributeSource
> {code}
> public void reset(AttributeSource input) {
>     if (input == null) {
>       throw new IllegalArgumentException("input AttributeSource must not be null");
>     }
>     this.attributes = input.attributes;
>     this.attributeImpls = input.attributeImpls;
>     this.factory = input.factory;
> }
> {code}
> Impacts:
> * requires all TokenStreams/TokenFilters/etc. to call addAttribute() in their 
> reset() method, not in their constructor
> * requires making AttributeSource.attributes and 
> AttributeSource.attributeImpls non-final
> Advantages:
> * allows creating only a single actual AttributeSource per thread that can then 
> be used for indexing with a multitude of TokenStream/Tokenizer combinations 
> (allowing utmost reuse of TokenStream/Tokenizer instances)
> * this results in only a single "attributes"/"attributeImpls" map being 
> required per thread
> * addAttribute() calls will almost always return right away (will only be 
> "initialized" once per thread)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

