As a note of warning: I did find StandardTokenizer to be the major culprit in my tokenizing benchmarks (avg 75ms for 16k sized docs). I have found I can live without StandardTokenizer in my apps.
FYI, the message with Mark's timings can be found at:
http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgId=1413989
According to these, if your documents average 16k, then a 10-hit result page would require just 66ms to generate highlights using SimpleAnalyzer.
Doug
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
