2013/2/28 Steve Rowe <sar...@gmail.com>:

> EnglishAnalyzer has used PorterStemmer instead of the English Snowball 
> stemmer since it was created in 2010 as part of LUCENE-2055[2].  I think this 
> is an oversight: EnglishAnalyzer should incorporate the best English stemmer 
> we've got, and Martin Porter says the Porter2 stemmer is better[1].  Robert 
> Muir (who wrote EnglishAnalyzer), if you're reading, what do you think?

This was actually intentional. The default was a tradeoff between the
"benefits" (which affect less than 5% of English vocabulary, if you
read around the Snowball site) and a much more significant
performance difference, which matters for a "default".

For example, see the tests I ran indexing both short and long texts:

http://find.searchhub.org/document/c1d3301b71dab5ca#46a8351089a98aec

That's overall indexing speed, not just text analysis.

It might be that the Snowball stemmer is faster these days too (we've
made some improvements).

