Hi Joe,

In Lucene 4.6, the TokenStream/Tokenizer APIs got some additional state machine
checks to ensure that consumers and subclasses of those abstract classes are
implemented correctly. The checks are not easy to understand, because they
are written in a way that does not affect performance. If your test case
consumes the Tokenizer/TokenStream incorrectly (e.g. it fails to call reset()
or setReader() at the correct places), an IllegalStateException is thrown.
The ILLEGAL_STATE_READER is there to ensure that the consumer gets a
meaningful exception if it calls setReader() or reset() in the wrong order (or
multiple times).
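
To illustrate why a check of the form "input != ILLEGAL_STATE_READER" can be
the intended one, here is a simplified model in plain Java. It is only a
sketch and not the Lucene source: the names ReaderStateSketch and
SketchTokenizer and the exception messages are made up for this example, and
only the reader handling is modeled. The assumption is that close() puts the
sentinel back, so setReader() expects to find the sentinel and treats anything
else as a missing close().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReaderStateSketch {

  // Sentinel meaning "no usable reader"; any read is a contract violation.
  static final Reader ILLEGAL_STATE_READER = new Reader() {
    @Override
    public int read(char[] cbuf, int off, int len) {
      throw new IllegalStateException("reset()/close() missing or reset() called twice");
    }

    @Override
    public void close() {}
  };

  // Hypothetical stand-in for a tokenizer; not the real Lucene class.
  static class SketchTokenizer {
    private Reader input = ILLEGAL_STATE_READER;
    private Reader inputPending = ILLEGAL_STATE_READER;

    void setReader(Reader reader) {
      if (reader == null) {
        throw new NullPointerException("reader must not be null");
      }
      // The sentinel is the expected value here: it means the previous
      // reader was closed. Anything else means close() was skipped.
      if (input != ILLEGAL_STATE_READER) {
        throw new IllegalStateException("close() call missing");
      }
      inputPending = reader;
    }

    void reset() {
      // Only now does the pending reader become readable.
      input = inputPending;
      inputPending = ILLEGAL_STATE_READER;
    }

    int read() {
      try {
        return input.read();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }

    void close() {
      try {
        input.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      input = inputPending = ILLEGAL_STATE_READER;
    }
  }

  // Returns true if the action fails with IllegalStateException.
  static boolean throwsIllegalState(Runnable action) {
    try {
      action.run();
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    SketchTokenizer t = new SketchTokenizer();

    // Correct order: setReader -> reset -> read -> close, then reuse.
    t.setReader(new StringReader("a"));
    t.reset();
    System.out.println((char) t.read());
    t.close();
    t.setReader(new StringReader("b")); // fine, the previous reader was closed
    t.reset();

    // Wrong: setReader() again without close() in between.
    System.out.println(throwsIllegalState(() -> t.setReader(new StringReader("c"))));

    // Wrong: reading without reset() hits the sentinel.
    SketchTokenizer u = new SketchTokenizer();
    u.setReader(new StringReader("d"));
    System.out.println(throwsIllegalState(u::read));
  }
}
```

In this model, both wrong call sequences end in an IllegalStateException,
while the correct sequence can be repeated as often as needed.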

The checks in the base class are definitely OK. If you hit the
IllegalStateException, you have a problem in your implementation of the
Tokenizer/TokenStream interface (e.g. missing super calls or calling reset()
from inside setReader()). Or the consumer does not respect the full
documented workflow:
http://lucene.apache.org/core/4_6_1/core/org/apache/lucene/analysis/TokenStream.html

If you have TokenFilters in your analysis chain, the source of the error may
also be missing super delegations in reset(), end(), etc. If you need further
help, post the consumer code from your test case, or post your analysis chain
and custom Tokenizers. Please also post the stack trace, because it helps to
find out which call sequence you have.
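
As a rough sketch of what a missing super delegation does to a chain, here is
a small plain-Java model. All names in it (ResetDelegationSketch,
SketchStream, Source, Filter, GoodCounter, BrokenCounter) are hypothetical and
only mimic the shape of a filter chain; they are not Lucene classes. The
assumption is that the base filter forwards reset() to the stream it wraps, so
a subclass that overrides reset() without calling super.reset() leaves the
source un-reset.

```java
public class ResetDelegationSketch {

  // Minimal stand-in for a token stream: reset it, then pull tokens.
  static abstract class SketchStream {
    void reset() {}

    abstract String next(); // returns null when exhausted
  }

  // Source that refuses to produce tokens unless reset() was called first.
  static class Source extends SketchStream {
    private final String[] tokens;
    private int pos = -1; // -1 means "not reset yet"

    Source(String... tokens) {
      this.tokens = tokens;
    }

    @Override
    void reset() {
      pos = 0;
    }

    @Override
    String next() {
      if (pos < 0) {
        throw new IllegalStateException("reset() call missing");
      }
      return pos < tokens.length ? tokens[pos++] : null;
    }
  }

  // Base filter: forwards reset() to the wrapped stream.
  static abstract class Filter extends SketchStream {
    protected final SketchStream in;

    Filter(SketchStream in) {
      this.in = in;
    }

    @Override
    void reset() {
      in.reset();
    }
  }

  // Correct: delegates to super.reset(), then clears its own state.
  static class GoodCounter extends Filter {
    int count;

    GoodCounter(SketchStream in) {
      super(in);
    }

    @Override
    void reset() {
      super.reset();
      count = 0;
    }

    @Override
    String next() {
      String token = in.next();
      if (token != null) {
        count++;
      }
      return token;
    }
  }

  // Broken: clears its own state but never calls super.reset().
  static class BrokenCounter extends Filter {
    int count;

    BrokenCounter(SketchStream in) {
      super(in);
    }

    @Override
    void reset() {
      count = 0; // BUG: super.reset() is missing, the source is never reset
    }

    @Override
    String next() {
      String token = in.next();
      if (token != null) {
        count++;
      }
      return token;
    }
  }

  // Consumes a stream the way a caller would: reset first, then read one token.
  static String firstToken(SketchStream stream) {
    stream.reset();
    return stream.next();
  }

  public static void main(String[] args) {
    System.out.println(firstToken(new GoodCounter(new Source("a", "b"))));
    try {
      firstToken(new BrokenCounter(new Source("a", "b")));
    } catch (IllegalStateException e) {
      System.out.println("broken chain: " + e.getMessage());
    }
  }
}
```

Note that the consumer is correct in both cases; the exception in the second
case comes from the filter, which is why the stack trace and the full chain
are needed to locate such a problem.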

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Joe Wong [mailto:jw...@adacado.com]
> Sent: Thursday, March 20, 2014 8:58 PM
> To: java-user@lucene.apache.org
> Subject: Possible issue with Tokenizer in lucene-analyzers-common-4.6.1
> 
> Hi
> 
> We're planning to upgrade lucene-analyzers-common from 4.3.0 to 4.6.1. While
> running our unit test with 4.6.1, it fails at
> org.apache.lucene.analysis.Tokenizer on line 88 (setReader method). There
> it checks if input != ILLEGAL_STATE_READER and then throws
> IllegalStateException. Should it not be if input == ILLEGAL_STATE_READER?
> 
> Regards,
> Joe


