Yes. Is there a way to enforce this automatically for all Tokenizers?
Since incrementToken() will be abstract in 3.0, there cannot be a default
implementation, so every Tokenizer has to call clearAttributes() as the
first call in incrementToken().
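
To make the convention concrete, here is a minimal sketch of such a chain.
It is a toy model, not Lucene code: every class name in it (ToyAttributes,
ToyStream, ToyTokenizer, ToyMarkerFilter, ToyDefaultPayloadFilter,
ClearAttributesDemo) is hypothetical, and the shared attribute state is
reduced to two plain fields. It only illustrates why the head of the chain
has to do the clearing when a downstream filter sets a payload only if none
is set yet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy stand-in for the attribute state shared by a chain (not Lucene's AttributeSource). */
class ToyAttributes {
    String term = "";
    String payload = null; // null means "no payload set"

    void clearAttributes() {
        term = "";
        payload = null;
    }
}

/** Hypothetical minimal stream interface for this sketch. */
interface ToyStream {
    boolean incrementToken();
}

/** Head of the chain: resets all attributes before producing each token. */
class ToyTokenizer implements ToyStream {
    private final ToyAttributes atts;
    private final Iterator<String> words;

    ToyTokenizer(ToyAttributes atts, String text) {
        this.atts = atts;
        this.words = Arrays.asList(text.split(" ")).iterator();
    }

    public boolean incrementToken() {
        atts.clearAttributes(); // first call, as proposed in this thread
        if (!words.hasNext()) return false;
        atts.term = words.next();
        return true;
    }
}

/** Sets a payload only on terms that start with '#'. */
class ToyMarkerFilter implements ToyStream {
    private final ToyAttributes atts;
    private final ToyStream input;

    ToyMarkerFilter(ToyAttributes atts, ToyStream input) {
        this.atts = atts;
        this.input = input;
    }

    public boolean incrementToken() {
        if (!input.incrementToken()) return false;
        if (atts.term.startsWith("#")) atts.payload = "marked";
        return true;
    }
}

/** Sets a default payload only if none is set; it cannot clear the payload itself. */
class ToyDefaultPayloadFilter implements ToyStream {
    private final ToyAttributes atts;
    private final ToyStream input;

    ToyDefaultPayloadFilter(ToyAttributes atts, ToyStream input) {
        this.atts = atts;
        this.input = input;
    }

    public boolean incrementToken() {
        if (!input.incrementToken()) return false;
        if (atts.payload == null) atts.payload = "default";
        return true;
    }
}

class ClearAttributesDemo {
    /** Runs the chain and returns each token as "term/payload". */
    static List<String> run(String text) {
        ToyAttributes atts = new ToyAttributes();
        ToyStream chain = new ToyDefaultPayloadFilter(atts,
                new ToyMarkerFilter(atts, new ToyTokenizer(atts, text)));
        List<String> out = new ArrayList<>();
        while (chain.incrementToken()) {
            out.add(atts.term + "/" + atts.payload);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("#a b"));
    }
}
```

If the clearAttributes() call in ToyTokenizer were removed, the payload set
for "#a" would still be present when "b" is processed, and the default
payload filter would wrongly leave it in place.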

Then we still have the problem of the slow iterator creation (which was
sped up a little by removing the unmodifiable wrapper). This could be
solved by keeping an additional ArrayList in AttributeSource that holds
all AttributeImpl instances, but that would add initialization cost when
creating the Tokenizer chain.
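
The ArrayList idea can be sketched roughly as follows. This is again a toy
model under stated assumptions, not the actual AttributeSource code: the
names ToyAttributeSource, ToyAttributeImpl, ToyTermAttribute and
ToyPayloadAttribute are hypothetical, and the two-argument addAttribute()
signature is invented for the example. The map handles lookup by class,
while a second list is filled at registration time so that
clearAttributes() can use an indexed loop instead of creating an iterator
over the map's values on every token.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical base class for attribute implementations in this sketch. */
abstract class ToyAttributeImpl {
    abstract void clear();
}

class ToyTermAttribute extends ToyAttributeImpl {
    String term = "";
    @Override void clear() { term = ""; }
}

class ToyPayloadAttribute extends ToyAttributeImpl {
    String payload = null;
    @Override void clear() { payload = null; }
}

class ToyAttributeSource {
    // Lookup by attribute class, used while the chain is being built.
    private final Map<Class<? extends ToyAttributeImpl>, ToyAttributeImpl> byClass =
            new LinkedHashMap<>();
    // Extra list of all instances; this is the additional initialization cost.
    private final List<ToyAttributeImpl> impls = new ArrayList<>();

    /** Registers the instance, or returns the one already registered for this class. */
    <T extends ToyAttributeImpl> T addAttribute(Class<T> type, T instance) {
        ToyAttributeImpl existing = byClass.get(type);
        if (existing != null) return type.cast(existing);
        byClass.put(type, instance);
        impls.add(instance);
        return instance;
    }

    /** Clears every attribute with an indexed loop, so no Iterator is allocated per call. */
    void clearAttributes() {
        for (int i = 0, n = impls.size(); i < n; i++) {
            impls.get(i).clear();
        }
    }
}
```

The cost moves from every incrementToken() call to the one-time
registration of each attribute, which is the trade-off described above.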

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
> Seeley
> Sent: Monday, August 10, 2009 7:42 PM
> To: java-dev@lucene.apache.org
> Subject: Re: who clears attributes?
> 
> Thinking through this a little more, I don't see an alternative to the
> tokenizer clearing all attributes at the start of incrementToken().
> 
> Consider a DefaultPayloadTokenFilter that only sets a payload if one
> isn't already set - it's clear that this filter can't clear the
> payload attribute, so it must be cleared by the head of the chain -
> the tokenizer.  Right?
> 
> -Yonik
> http://www.lucidimagination.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org



