I tested the attached patch; all tests still compile and work as expected (as CharStream extends Reader).
Should I open an issue for this?

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Thursday, September 10, 2009 4:54 PM
> To: java-dev@lucene.apache.org
> Subject: Problem with CharStream and Tokenizers with custom reset(Reader) method
>
> When reviewing the new CharStream code added to Tokenizers, I found a
> serious backwards-compatibility problem affecting Tokenizers that do
> not override reset(CharStream).
>
> The problem is that, e.g., CharTokenizer only overrides reset(Reader):
>
>   public void reset(Reader input) throws IOException {
>     super.reset(input);
>     bufferIndex = 0;
>     offset = 0;
>     dataLen = 0;
>   }
>
> If you reset such a Tokenizer with another CharStream (not a plain Reader),
> this method is never called, which breaks the whole Tokenizer.
>
> As CharStream extends Reader, I propose to remove the reset(CharStream)
> method and simply do an instanceof check to detect whether the supplied
> Reader is not a CharStream, and wrap it in that case. We could also remove
> the extra ctor (because most Tokenizers have no support for passing
> CharStreams). If the ctor also checks with instanceof and wraps as needed,
> the code is backwards compatible and we do not need to add additional
> ctors in subclasses.
>
> As this instanceof check is always done in CharReader.get(), why not remove
> ctor(CharStream) and reset(CharStream) completely?
>
> Any thoughts?
>
> I would like to fix this somehow before RC4, I'm sorry :(
>
> Uwe
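
To make the proposal in the quoted message concrete, here is a minimal sketch of the "single reset(Reader) plus instanceof wrap" idea. It is not the attached patch and does not use Lucene's real classes: every Sketch* name below is a hypothetical stand-in, and the only detail taken from the message is that the wrapping helper does the instanceof check and that a CharTokenizer-like subclass overrides only reset(Reader).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for CharStream: a Reader that can also correct offsets.
abstract class SketchCharStream extends Reader {
    abstract int correctOffset(int currentOff);
}

// Hypothetical stand-in for CharReader: wraps a plain Reader as a SketchCharStream.
final class SketchCharReader extends SketchCharStream {
    private final Reader in;

    private SketchCharReader(Reader in) {
        this.in = in;
    }

    // The instanceof check: pass an existing stream through, wrap anything else.
    static SketchCharStream get(Reader input) {
        return (input instanceof SketchCharStream)
                ? (SketchCharStream) input
                : new SketchCharReader(input);
    }

    @Override
    int correctOffset(int currentOff) {
        return currentOff; // a plain Reader needs no offset correction
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        return in.read(cbuf, off, len);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

// Base class with one ctor and one reset, both taking Reader (assumed design).
abstract class SketchTokenizer {
    protected SketchCharStream input;

    protected SketchTokenizer(Reader input) {
        this.input = SketchCharReader.get(input);
    }

    public void reset(Reader input) throws IOException {
        this.input = SketchCharReader.get(input);
    }
}

// Like CharTokenizer in the message: only reset(Reader) is overridden.
class SketchCharTokenizer extends SketchTokenizer {
    int bufferIndex, offset, dataLen;

    SketchCharTokenizer(Reader input) {
        super(input);
    }

    @Override
    public void reset(Reader input) throws IOException {
        super.reset(input);
        bufferIndex = 0;
        offset = 0;
        dataLen = 0;
    }
}

class ResetSketchDemo {
    public static void main(String[] args) throws IOException {
        SketchCharTokenizer t = new SketchCharTokenizer(new StringReader("first"));
        t.bufferIndex = 3; // simulate leftover state from a previous run

        // Resetting with a stream still reaches the subclass override,
        // because there is no separate reset(CharStream) overload to bypass it.
        t.reset(SketchCharReader.get(new StringReader("second")));
        System.out.println("bufferIndex after reset: " + t.bufferIndex);
    }
}
```

With only one reset overload, a subclass that clears its state in reset(Reader) is called for plain Readers and for streams alike, which is the backwards-compatibility property the proposal is after.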
Tokenizer-CharStream-fix.patch
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org