Hi Uwe,
I think it was my mistake in the code: in my Lucene analyzer class I had
implemented the following method:
@Override
protected TokenStreamComponents createComponents(String fieldName,
    Reader reader) {
  Tokenizer tokenizer = new WhitespaceTokenizer(reader);
  TokenStream result = new LowerCaseFilter(tokenizer);
  result = new DelimitedPayloadTokenFilter(result, '|', encoder);
  return new TokenStreamComponents(new WhitespaceTokenizer(reader), result);
}
It was a mistake to create WhitespaceTokenizer twice: the filter chain wraps
the first tokenizer, but TokenStreamComponents only propagates setReader()
to the tokenizer passed as its source, so the tokenizer inside the chain
never receives the reader and ends up reading the illegal-state reader,
which triggers the contract-violation exception. The correct implementation
is:
@Override
protected TokenStreamComponents createComponents(String fieldName,
    Reader reader) {
  Tokenizer tokenizer = new WhitespaceTokenizer(reader);
  TokenStream result = new LowerCaseFilter(tokenizer);
  result = new DelimitedPayloadTokenFilter(result, '|', encoder);
  return new TokenStreamComponents(tokenizer, result);
}
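For anyone who hits the same exception: the chain must also be consumed
according to the TokenStream contract (reset, incrementToken loop, end,
close). A minimal sketch, assuming Lucene 4.10 on the classpath and the
analyzer above; the field name and sample text are made-up placeholders:

```java
// Requires lucene-core and lucene-analyzers-common, plus imports for
// TokenStream, CharTermAttribute and PayloadAttribute.
// "field" and the sample text are placeholders.
TokenStream ts = analyzer.tokenStream("field", "hello|0.5 world|0.8");
CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
PayloadAttribute payload = ts.addAttribute(PayloadAttribute.class);
try {
  ts.reset();                  // mandatory before the first incrementToken()
  while (ts.incrementToken()) {
    System.out.println(term + " payload=" + payload.getPayload());
  }
  ts.end();                    // records final offset state
} finally {
  ts.close();                  // releases the reader so the stream can be reused
}
```

Skipping reset() or close() here produces exactly the IllegalStateException
quoted below, which is why the exception message points at the consuming
workflow as well as the analyzer.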
Sorry about the noise!
On 23 April 2015 at 17:14, Uwe Schindler <[email protected]> wrote:
> Of course!
>
> Do you have code to reproduce?
>
> Uwe
>
>
> On 23 April 2015 at 15:54:06 MESZ, Dmitry Kan <
> [email protected]> wrote:
>>
>> Hi,
>>
>> In Lucene 4.10.4 the DelimitedPayloadTokenFilter class seems to violate
>> the contract of the TokenStream. Should I raise a jira? Thanks.
>>
>>
>>
>> java.lang.IllegalStateException: TokenStream contract violation:
>> reset()/close() call missing, reset() called multiple times, or subclass
>> does not call super.reset(). Please see Javadocs of TokenStream class for
>> more information about the correct consuming workflow.
>> at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:111)
>> at org.apache.lucene.analysis.util.CharacterUtils.readFully(CharacterUtils.java:241)
>> at org.apache.lucene.analysis.util.CharacterUtils$Java5CharacterUtils.fill(CharacterUtils.java:283)
>> at org.apache.lucene.analysis.util.CharacterUtils.fill(CharacterUtils.java:231)
>> at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:148)
>> at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:62)
>> at org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter.incrementToken(DelimitedPayloadTokenFilter.java:55)
>>
>>
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de
>