> On Mon, Aug 10, 2009 at 12:44 PM, Uwe Schindler<u...@thetaphi.de> wrote:
> > the CharTokenizer should only clear the TermAttribute, as it is only
> > using this attribute.

I changed this in the latest patch for
https://issues.apache.org/jira/browse/LUCENE-1796

> It's certainly not clear to me - is there an established convention?
> Either Tokenizer clears all attributes, or each tokenizer clears those
> attributes it cares about.  But in the latter case, wouldn't that
> potentially cause multiple TokenFilters to clear the same attribute?

Clearing attributes in TokenFilters is not ideal. The problem is that
calling clear() on an AttributeImpl may clear more than the directly
referenced values: the multi-attribute implementations currently in use,
such as Token/TokenWrapper, always clear all 6 standard attributes.
Because of this, I would clear attributes only in TokenStream/Tokenizer, but
then by default for all Tokenizers. Maybe we should implement this. The
remaining problem is the iterator creation, but I have no better
solution, as Maps only work with iterators for enumerating values... :(
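
The point above can be illustrated with a minimal, self-contained sketch.
All names in it (TermAttr, OffsetAttr, Clearable, CombinedImpl, ToySource)
are hypothetical stand-ins, not the actual Lucene classes, and the map-based
registry is only an assumption about how such a source might be organized.
It shows two things: clearing through one attribute interface resets every
value held by a shared implementation, and a "clear everything" method has to
iterate over the map's values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for attribute interfaces; not the Lucene classes.
interface TermAttr { String term(); void setTerm(String t); }
interface OffsetAttr { int startOffset(); void setStartOffset(int o); }

// Common contract: every implementation can reset itself.
interface Clearable { void clear(); }

// One object backing both attributes, in the spirit of Token/TokenWrapper.
class CombinedImpl implements TermAttr, OffsetAttr, Clearable {
    private String term = "";
    private int start = 0;
    public String term() { return term; }
    public void setTerm(String t) { term = t; }
    public int startOffset() { return start; }
    public void setStartOffset(int o) { start = o; }
    // clear() cannot know which interface the caller meant, so it resets everything.
    public void clear() { term = ""; start = 0; }
}

// Toy registry: attribute interface -> implementation (assumed design).
class ToySource {
    // Several keys may point at the same implementation instance.
    private final Map<Class<?>, Clearable> attrs = new LinkedHashMap<>();
    <T> void register(Class<T> type, Clearable impl) { attrs.put(type, impl); }
    <T> T get(Class<T> type) { return type.cast(attrs.get(type)); }
    // Enumerating the values goes through an iterator on every call.
    void clearAll() { for (Clearable c : attrs.values()) c.clear(); }
}

public class ClearDemo {
    public static void main(String[] args) {
        CombinedImpl shared = new CombinedImpl();
        ToySource src = new ToySource();
        src.register(TermAttr.class, shared);
        src.register(OffsetAttr.class, shared);

        src.get(TermAttr.class).setTerm("lucene");
        src.get(OffsetAttr.class).setStartOffset(42);

        // A filter that only "cares about" the term still wipes the offset.
        ((Clearable) src.get(TermAttr.class)).clear();
        System.out.println(src.get(OffsetAttr.class).startOffset()); // prints 0, not 42
    }
}
```

In this toy model, clearing once at the source (clearAll) avoids the
surprise of one filter silently resetting values another component set, at
the cost of one iteration over the map per token.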

Uwe

