[lucy-dev] Some quick benchmarks

Some quick and completely unscientific benchmarks, indexing the same 10K ASCII document 1000 times:


RT = RegexTokenizer
ST = StandardTokenizer
CF = CaseFolder
N  = Normalizer


RT:    2.177s
RT+CF: 3.964s
RT+N:  2.556s
ST:    1.551s
ST+CF: 3.357s
ST+N:  1.931s
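
One way to read these numbers is to subtract each bare tokenizer time from the corresponding combined time, which gives a rough cost for the extra analyzer. The Python sketch below only does that arithmetic on the figures above. It is not part of the benchmark itself, and the `timings` dict and `overhead` helper are names made up for illustration. Since these are single, unscientific runs, the differences should be taken as rough indications only.

```python
# Timings in seconds, copied from the results above.
timings = {
    "RT": 2.177, "RT+CF": 3.964, "RT+N": 2.556,
    "ST": 1.551, "ST+CF": 3.357, "ST+N": 1.931,
}


def overhead(tokenizer, extra):
    """Seconds added when `extra` is combined with `tokenizer` (hypothetical helper)."""
    return round(timings[f"{tokenizer}+{extra}"] - timings[tokenizer], 3)


if __name__ == "__main__":
    for tok in ("RT", "ST"):
        for extra in ("CF", "N"):
            print(f"{tok}+{extra}: +{overhead(tok, extra):.3f}s over {tok} alone")
```

By this subtraction, the CaseFolder adds roughly 1.8s with either tokenizer and the Normalizer roughly 0.38s, so the extra cost looks largely independent of which tokenizer is used.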

It's also interesting that moving the tokenizer in front of the casefolder or normalizer always gave me faster results.


Nick
