People...
I've created a custom analyzer that uses the StandardTokenizer class
to get the tokens from the reader.
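For context, a custom analyzer of the kind described above might look roughly
like this (a sketch against the classic Lucene Analyzer API; everything here
apart from StandardTokenizer — the class name, the added LowerCaseFilter —
is an assumption, not my actual code):

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical custom analyzer: tokenize with StandardTokenizer,
// then lowercase the tokens.
public class MyAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        return stream;
    }
}
```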
It seemed to work fine, but I just noticed that some large documents
are not being fully indexed: only the starting part of their content
ends up in the index.
After some debugging I've found that StandardTokenizer reads up
to 10001 tokens from the reader.
Has anybody run into something like this before? What should I
do as a workaround?
Thanks !
Flavio