> Is 256 also some inner maximum in some Lucene internal that causes
> this? What is happening is that the long word is split into smaller
> words of up to 256 characters, and then the min and max limits are
> applied. Is that correct? I have removed LengthFilter and still see
> the splitting at 256 happen. I would like not to have this, and
> instead to remove altogether any word longer than max, without
> decomposing it into smaller ones. Is there a way to achieve this?
>
> Using Lucene 3.0.1
Assuming your Tokenizer extends CharTokenizer: CharTokenizer.java has
this field:

    private static final int MAX_WORD_LEN = 255;

You can modify CharTokenizer.java according to your needs.
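
For reference, below is a minimal sketch of a standalone tokenizer that
drops over-long words entirely instead of splitting them. Because
MAX_WORD_LEN is private static final, a subclass of CharTokenizer cannot
change it, so the sketch extends Tokenizer directly. It is written
against the Lucene 3.0.1 API but untested; the class name
DroppingWhitespaceTokenizer and the maxTokenLength parameter are made up
for illustration.

import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

/**
 * Sketch: splits on whitespace, but silently discards any word longer
 * than maxTokenLength instead of breaking it into smaller tokens.
 */
public final class DroppingWhitespaceTokenizer extends Tokenizer {

  private final int maxTokenLength;
  private final TermAttribute termAtt = addAttribute(TermAttribute.class);
  private final OffsetAttribute offsetAtt = addAttribute(OffsetAttribute.class);
  private int offset = 0; // absolute character position in the input

  public DroppingWhitespaceTokenizer(Reader input, int maxTokenLength) {
    super(input);
    this.maxTokenLength = maxTokenLength;
  }

  // Token characters: anything that is not whitespace.
  private boolean isTokenChar(int c) {
    return !Character.isWhitespace(c);
  }

  @Override
  public boolean incrementToken() throws IOException {
    clearAttributes();
    StringBuilder word = new StringBuilder();
    int start = -1;
    int c;
    // Unbuffered reads keep the sketch short; CharTokenizer uses an
    // ioBuffer here, and production code should too.
    while ((c = input.read()) != -1) {
      offset++;
      if (isTokenChar(c)) {
        if (word.length() == 0) {
          start = offset - 1; // remember where this word began
        }
        word.append((char) c);
      } else if (word.length() > 0) {
        if (word.length() <= maxTokenLength) {
          emit(word, start);
          return true;
        }
        word.setLength(0); // too long: drop the whole word, keep scanning
      }
    }
    // Handle a word that runs up to end-of-input.
    if (word.length() > 0 && word.length() <= maxTokenLength) {
      emit(word, start);
      return true;
    }
    return false;
  }

  private void emit(StringBuilder word, int start) {
    termAtt.setTermBuffer(word.toString());
    offsetAtt.setOffset(correctOffset(start), correctOffset(start + word.length()));
  }

  @Override
  public void reset(Reader input) throws IOException {
    super.reset(input);
    offset = 0;
  }
}

You would use it in your Analyzer wherever you currently create your
tokenizer; words longer than maxTokenLength are then never emitted, so
no 255-character fragments reach the rest of the filter chain.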