Every filter must support reset(). The reusable-analyzer handling does not need 
to call it, because the consumer must call reset() before calling 
incrementToken() for the first time.

In 3.4 there are sometimes redundant extra calls to reset(), kept for 
backwards-compatibility reasons (such as on the sink), but in trunk (aka 4.0) 
they are gone.

The need to call reset() before consuming is described in the TokenStream javadocs.
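
As a rough illustration of that contract, here is a minimal, self-contained sketch. It does not use the Lucene classes: SimpleStream, ListSource, LimitFilter and consume() are hypothetical stand-ins, loosely modelled on a tokenizer, a stateful count-limiting filter and a consumer. It assumes the filter's reset() forwards to its input and clears its own state, so a reused chain is clean as long as the consumer calls reset() first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
final class ResetContractSketch {

    /** Minimal model of a token stream: reset(), then repeated incrementToken(). */
    interface SimpleStream {
        void reset();
        boolean incrementToken();
        String term();
    }

    /** Source that can be pointed at new input, like a tokenizer given a new Reader. */
    static final class ListSource implements SimpleStream {
        private List<String> input = new ArrayList<>();
        private Iterator<String> it = input.iterator();
        private String current;

        // Plays the role of "reset the source with a new reader" in the reuse path.
        void setInput(List<String> newInput) {
            input = newInput;
            it = input.iterator();
        }

        @Override public void reset() {
            it = input.iterator();
            current = null;
        }

        @Override public boolean incrementToken() {
            if (!it.hasNext()) {
                return false;
            }
            current = it.next();
            return true;
        }

        @Override public String term() {
            return current;
        }
    }

    /** Stateful filter: passes at most maxCount tokens; reset() clears the counter. */
    static final class LimitFilter implements SimpleStream {
        private final SimpleStream in;
        private final int maxCount;
        private int seen;

        LimitFilter(SimpleStream in, int maxCount) {
            this.in = in;
            this.maxCount = maxCount;
        }

        @Override public void reset() {
            in.reset();  // forward down the chain
            seen = 0;    // clear this filter's own state
        }

        @Override public boolean incrementToken() {
            if (seen >= maxCount || !in.incrementToken()) {
                return false;
            }
            seen++;
            return true;
        }

        @Override public String term() {
            return in.term();
        }
    }

    /** The consumer side of the contract: reset() before the first incrementToken(). */
    static List<String> consume(SimpleStream stream) {
        stream.reset();
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }

    public static void main(String[] args) {
        ListSource source = new ListSource();
        LimitFilter sink = new LimitFilter(source, 2);

        source.setInput(Arrays.asList("a", "b", "c"));
        System.out.println(consume(sink)); // prints [a, b]

        // Reuse the same chain: only the source gets new input here;
        // the consumer's reset() call clears the filter's counter.
        source.setInput(Arrays.asList("d", "e", "f"));
        System.out.println(consume(sink)); // prints [d, e]
    }
}
```

In this sketch, if consume() skipped its reset() call, the second pass would return no tokens, because the filter's counter would still be at its limit.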

Uwe
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de



Paul Jakubik <[email protected]> wrote:

Hi,


I think I found a bug in ReusableAnalyzerBase, but am also wondering if I'm 
simply missing something. Let me describe what I am seeing, and maybe you can 
point out where I'm making bad assumptions.


By using the ReusableAnalyzerBase you can create a single shared analyzer, and 
it contains code to make the interesting parts of your analyzer thread local.


Part of making this work is putting all of the interesting components inside 
ReusableAnalyzerBase.TokenStreamComponents.


When you call ReusableAnalyzerBase.reusableTokenStream, it checks if it has a 
thread local TokenStreamComponents, and if so it calls 
TokenStreamComponents.reset(Reader) resetting the token source. This method 
does not reset the TokenStream sink in TokenStreamComponents.


Because of this, if any of the filters in the TokenStream are stateful, you 
have to recreate them instead of resetting them and using them again. So if you 
use a filter like LimitTokenCountFilter or ShingleFilter, you have to recreate 
it, even though these filters have reset methods that could be called.


Am I missing important reasons why TokenStreamComponents.reset is implemented 
as:

    protected boolean reset(final Reader reader) throws IOException {
      source.reset(reader);
      return true;
    }


instead of

    protected boolean reset(final Reader reader) throws IOException {
      source.reset(reader);
      sink.reset();
      return true;
    }


If there is a good reason to avoid resetting the sink here, then would it help 
other people to better document that implementations of 
ReusableAnalyzerBase.createComponents should not create stateful components?


Paul




