People...
I've created a custom analyzer that uses the StandardTokenizer class
to get the tokens from the reader.
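For context, a custom analyzer of the kind described above might look roughly
like this (a sketch against the classic Lucene Analyzer API; everything here
apart from StandardTokenizer — the class name, the added LowerCaseFilter —
is an assumption, not my actual code):

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical custom analyzer: tokenize with StandardTokenizer,
// then lowercase the tokens.
public class MyAnalyzer extends Analyzer {
    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream stream = new StandardTokenizer(reader);
        stream = new LowerCaseFilter(stream);
        return stream;
    }
}
```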
It seemed to work fine, but I just noticed that some large documents
are not being fully indexed: only the starting part of their content
ends up in the index.
After some debugging I've found that StandardTokenizer reads up
to 10001 tokens from the reader.
Has anybody run into something like this before? What should I
do as a workaround?
Thanks !
Flavio