Excerpts from Aldous D. Penaranda's message of Fri Jan 19 09:01:43 -0800 2007:
> The StandardAnalyzer documentation says that it filters
> LetterTokenizer with LowerCaseFilter.

My interpretation of
http://ferret.davebalmain.com/api/classes/Ferret/Analysis/StandardAnalyzer.html
is that StandardAnalyzer uses FULL_ENGLISH_STOP_WORDS as the stopword
list.

Perhaps I'm wrong; I've never verified it empirically. I'm of the
opinion that the whole concept of stopwords is a relic of 1970s
technology and the TREC ad-hoc query paradigm, neither of which is
particularly relevant for modern-day web search, so I typically turn
them off.
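
As a rough illustration of why a stop list can hurt web-style queries,
here is a minimal pure-Ruby sketch. It does not use Ferret: the
tokenize and remove_stop_words helpers and the SAMPLE_STOP_WORDS list
are hypothetical stand-ins, and the list is not Ferret's
FULL_ENGLISH_STOP_WORDS.

```ruby
# Hypothetical, abbreviated stop list for illustration only; it is NOT
# Ferret's FULL_ENGLISH_STOP_WORDS.
SAMPLE_STOP_WORDS = %w[a an and be not or the to].freeze

# Split on runs of ASCII letters and lowercase each token, roughly what
# a letter tokenizer followed by a lowercase filter would do.
def tokenize(text)
  text.scan(/[A-Za-z]+/).map(&:downcase)
end

# Drop every token that appears in the given stop list.
def remove_stop_words(tokens, stop_words)
  tokens.reject { |token| stop_words.include?(token) }
end

query  = "To be or not to be"
tokens = tokenize(query)

puts tokens.inspect                                       # all six tokens
puts remove_stop_words(tokens, SAMPLE_STOP_WORDS).inspect # nothing survives
puts remove_stop_words(tokens, []).inspect                # empty list keeps everything
```

With the sample list, every term of the query is filtered out, so the
phrase could not be searched for at all. Passing an empty list keeps
all of the tokens, which is the effect of turning stopwords off.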

-- 
William <[EMAIL PROTECTED]>
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
