On Mar 15, 2013, at 11:25 AM, "Uwe Schindler" <u...@thetaphi.de> wrote:
> Hi,
>
> The API did not really change.

The API definitely did change, as before you would override the now-final tokenStream() method. But you are correct that this was not the root of the problem.

> The bug is in your test:
> If you would carefully read the javadocs of the TokenStream interface, you
> would notice that your consumer does not follow the correct workflow:
> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/analysis/TokenStream.html
>
> In short, before calling incrementToken() the TokenStream must be reset().
> This did not change and was always the case. In earlier Lucene versions, lots
> of TokenStreams were behaving wrong, so we made the basic Tokenizers "fail"
> in some way. The Exception is not really helpful here, but for performance
> reasons this was the only way to go.
>
> Please always take care that the described workflow in the Javadocs is always
> used from top to bottom (including end() and close()), otherwise behavior of
> TokenStreams is not guaranteed to be correct.

Thank you, this was exactly the problem. It would be nice if the tokenizers did some checking of state to catch issues like this, or at least emitted a clearer error message, but I was definitely doing this wrong.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
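For anyone finding this thread later, here is a sketch of the consumer workflow the javadocs describe, written against the Lucene 4.2 API. The field name "field" and the helper class/method names are just illustrative choices, not anything from the original messages:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenStreamWorkflow {
    // Consume a TokenStream following the workflow from the TokenStream javadocs:
    // reset() -> incrementToken() loop -> end() -> close().
    public static List<String> tokenize(String text) throws IOException {
        List<String> tokens = new ArrayList<String>();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_42);
        TokenStream ts = analyzer.tokenStream("field", new StringReader(text));
        CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
        try {
            ts.reset();                      // mandatory before the first incrementToken()
            while (ts.incrementToken()) {
                tokens.add(termAtt.toString());
            }
            ts.end();                        // perform end-of-stream operations (e.g. final offset)
        } finally {
            ts.close();                      // release resources held by the stream
        }
        analyzer.close();
        return tokens;
    }
}
```

Skipping the reset() call here is exactly what triggers the unhelpful exception from the basic Tokenizers discussed above.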