On Mar 15, 2013, at 11:25 AM, "Uwe Schindler" <u...@thetaphi.de> wrote:

> Hi,
> 
> The API did not really change.

The API definitely did change: previously you would override the tokenStream 
method, which is now final. But you are correct that this was not the root of 
the problem.
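
(For reference, a custom Analyzer in 4.x overrides createComponents() instead. 
A minimal sketch follows; the StandardTokenizer/LowerCaseFilter chain is purely 
illustrative, not what my real analyzer does:)

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

// Sketch only: override createComponents() rather than the now-final tokenStream().
public final class MyAnalyzer extends Analyzer {
  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer source = new StandardTokenizer(Version.LUCENE_42, reader);
    return new TokenStreamComponents(source,
        new LowerCaseFilter(Version.LUCENE_42, source));
  }
}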

> The bug is in your test:
> If you read the javadocs of the TokenStream interface carefully, you will 
> notice that your consumer does not follow the correct workflow: 
> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/analysis/TokenStream.html
> 
> In short, before calling incrementToken() the TokenStream must be reset(). 
> This did not change and was always the case. In earlier Lucene versions, many 
> TokenStreams behaved incorrectly when this step was skipped, so we made the 
> basic Tokenizers "fail" in that situation. The exception is not really helpful 
> here, but for performance reasons this was the only way to go.
> 
> Please take care that the workflow described in the Javadocs is always 
> followed from top to bottom (including end() and close()); otherwise the 
> behavior of TokenStreams is not guaranteed to be correct.
> 


Thank you, this was exactly the problem.  It would be nice if the tokenizers 
did some state checking to catch issues like this, or at least emitted a 
clearer error message, but I was definitely doing this wrong.
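
For anyone who hits the same thing, a consumer loop that follows the documented 
workflow looks roughly like this (a minimal sketch; the analyzer, field name, 
and input text are just placeholders for my actual code):

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenStreamWorkflow {
  public static void main(String[] args) throws IOException {
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_42);
    TokenStream ts = analyzer.tokenStream("body",
        new StringReader("some text to tokenize"));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    try {
      ts.reset();                // mandatory before the first incrementToken()
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();                  // records end-of-stream state (e.g. final offset)
    } finally {
      ts.close();                // releases resources held by the stream
    }
  }
}

My mistake was skipping reset() and going straight to incrementToken().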




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
