On Saturday 05 August 2006 22:31, Yonik Seeley wrote:

> Stop words and stemming always make literal searching less precise,
> with the general benefit of greater matching power (more general) and
> smaller index size.

That's why I gave the "t-online" example: it makes the search result look 
incorrect but hardly helps reduce index size. "t" and "s" were probably 
added so "don't" doesn't get indexed as "don", "t", but this doesn't 
happen anyway as the StandardTokenizer keeps "don't" as a single token. 
"'s" is cut off in StandardFilter.
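
To make the effect concrete, here is a minimal sketch in plain Python. It 
does not use Lucene: tokenize() is a toy stand-in that keeps internal 
apostrophes and splits on hyphens, which is what I assume happens to 
"t-online", and strip_possessive() only imitates the "'s" removal in 
StandardFilter. All function names and both stopword sets are made up for 
illustration.

```python
import re

# Toy tokenizer: keeps internal apostrophes ("don't") as part of a token and
# splits on everything else, including hyphens. NOT Lucene's actual grammar.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def strip_possessive(tokens):
    # Imitates the "'s" removal attributed to StandardFilter (assumption).
    return [t[:-2] if t.endswith("'s") else t for t in tokens]


def remove_stopwords(tokens, stopwords):
    return [t for t in tokens if t not in stopwords]


def analyze(text, stopwords):
    return remove_stopwords(strip_possessive(tokenize(text)), stopwords)


# Hypothetical stopword sets, with and without the single letters.
WITH_T_S = {"a", "the", "t", "s"}
WITHOUT_T_S = {"a", "the"}

print(analyze("t-online", WITH_T_S))     # the "t" is dropped
print(analyze("t-online", WITHOUT_T_S))  # both parts survive
print(analyze("don't", WITH_T_S))        # never split, so "t" is irrelevant
```

Under these assumptions, "t" as a stopword changes nothing for "don't" but 
turns "t-online" into a search for just "online".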

In general, this is only a default list and people will need to adapt it 
anyway. So we should only add the words which are probably stopwords for 
most users.

Regards
 Daniel

-- 
http://www.danielnaber.de

