In my opinion, clearing the attributes in CharTokenizer is completely unnecessary. The TermAttribute and OffsetAttribute are always initialized correctly (at the very least, termLength is set to 0) whenever incrementToken() returns true.
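To illustrate, here is a minimal, self-contained sketch of the pattern. It is not the actual Lucene source: TermAttr, OffsetAttr and LetterTokenizer are hypothetical stand-ins for TermAttribute, OffsetAttribute and a CharTokenizer subclass, and the 255-char buffer size is an assumption. It only shows that a tokenizer which overwrites the term length and both offsets on every successful incrementToken() never exposes stale values for those two attributes, even without clearing first.

```java
public class ClearAttributesSketch {

    // Hypothetical stand-in for TermAttribute: a reusable char buffer plus a length.
    static final class TermAttr {
        final char[] buffer = new char[255]; // assumed maximum word length
        int length;
        String term() { return new String(buffer, 0, length); }
    }

    // Hypothetical stand-in for OffsetAttribute.
    static final class OffsetAttr {
        int start;
        int end;
    }

    // Splits the input on non-letters, similar in spirit to a CharTokenizer.
    static final class LetterTokenizer {
        final TermAttr termAtt = new TermAttr();
        final OffsetAttr offsetAtt = new OffsetAttr();
        private final String input;
        private int pos = 0;

        LetterTokenizer(String input) { this.input = input; }

        boolean incrementToken() {
            // Note: no clearAttributes() call here.
            int length = 0;
            int start = pos;
            while (pos < input.length()) {
                char c = input.charAt(pos++);
                if (Character.isLetter(c)) {
                    if (length == 0) start = pos - 1;   // first char of the token
                    termAtt.buffer[length++] = c;
                    if (length == termAtt.buffer.length) break; // buffer full
                } else if (length > 0) {
                    break;                              // end of the current token
                }
            }
            if (length == 0) return false;              // no more tokens
            // Every value a consumer reads is overwritten before returning true.
            termAtt.length = length;
            offsetAtt.start = start;
            offsetAtt.end = start + length;
            return true;
        }
    }

    public static void main(String[] args) {
        LetterTokenizer t = new LetterTokenizer("ab cd");
        while (t.incrementToken()) {
            System.out.println(t.termAtt.term()
                + " [" + t.offsetAtt.start + "," + t.offsetAtt.end + "]");
        }
        // prints:
        // ab [0,2]
        // cd [3,5]
    }
}
```

Attributes that such a tokenizer never sets are left untouched by this approach, which is the trade-off under discussion.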
I would simply remove the call to clearAttributes() altogether.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

> -----Original Message-----
> From: Uwe Schindler [mailto:[email protected]]
> Sent: Monday, August 10, 2009 6:44 PM
> To: [email protected]; [email protected]
> Subject: RE: who clears attributes?
>
> I already removed the unmodifiable iterator, so one new instance is removed
> (see the JIRA issue). But you are right, the CharTokenizer should only clear
> the TermAttribute, as it is only using this attribute.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
>
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]] On Behalf Of Yonik Seeley
> > Sent: Monday, August 10, 2009 6:01 PM
> > To: [email protected]
> > Subject: who clears attributes?
> >
> > CharTokenizer.incrementToken() clears *all* attributes in the entire
> > tokenizer chain.
> > StandardTokenizer.incrementToken() clears only the term attribute.
> >
> > So... which is right? Seems like the tokenizer should be responsible?
> >
> > On a performance related note, CharTokenizer.clearAttributes() could be
> > more efficient - 2 new objects (the unmodifiable map and the iterator
> > object) are created for every incrementToken.
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
