Music is another domain where this is a real problem. E.g., "The The", "The Who", not to mention the song and album names.

-Sean

Walter Underwood wrote:
We do a similar thing with a no-stopword, no-stemming field.

There are a surprising number of movie titles that are entirely
stopwords. "Being There" was the first one I noticed, but
"To be and to have" wins the prize for being all-stopwords
in two languages.

See my list, here:

http://wunderwood.org/most_casual_observer/2007/05/invisible_titles.html

wunder

On 3/21/08 6:14 PM, "Lance Norskog" <[EMAIL PROTECTED]> wrote:

Yes.  Our in-house example is the movie title "The Sound Of Music". Given in
quotes as a phrase this will pull up "anystopword Sound anystopword Music".
For example, "A Sound With Music". Your example is also a test case of ours.
For some Lucenicious reason "six stopwords in a row" does not find anything.
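
A minimal sketch of the behavior described above, assuming the analyzer drops stopwords but keeps the position gaps they leave behind. This is a toy model in plain Python, not Lucene's code; the stopword list and the names analyze and phrase_match are made up for illustration.

```python
# Illustrative stopword list (an assumption, not any analyzer's actual default).
STOPWORDS = {"a", "the", "of", "with", "to", "be", "or", "not"}

def analyze(text, stopwords=STOPWORDS):
    """Return (term, position) pairs; stopwords are dropped but positions are kept."""
    tokens = text.lower().split()
    return [(tok, pos) for pos, tok in enumerate(tokens) if tok not in stopwords]

def phrase_match(query, document, stopwords=STOPWORDS):
    """True if the surviving query terms occur in the document at the same relative positions."""
    q = analyze(query, stopwords)
    d = analyze(document, stopwords)
    if not q:
        # Every query term was a stopword, so there is nothing left to match.
        return False
    doc_positions = {}
    for term, pos in d:
        doc_positions.setdefault(term, set()).add(pos)
    first_term, first_pos = q[0]
    for start in doc_positions.get(first_term, ()):
        offset = start - first_pos
        if all(pos + offset in doc_positions.get(term, ()) for term, pos in q):
            return True
    return False

print(phrase_match("The Sound Of Music", "A Sound With Music"))
print(phrase_match("to be or not to be", "to be or not to be"))
```

In this model "The Sound Of Music" reduces to "sound" at position 1 and "music" at position 3, which is exactly what "A Sound With Music" reduces to, so the two are indistinguishable. A query made up entirely of stopwords reduces to nothing, so it cannot match any document.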

We solved this problem by making a separate indexed field with a simplified
text type: no stopword removal. Phrase searches go against the 'rawfield' and
word searches go against it first. You may want to also filter out punctuation
or "Sound Of Music" will not bring up "Sound Of Music!".
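
A rough sketch of the raw-field idea, assuming the field's analysis is just lowercasing plus punctuation stripping, with every word kept. It is plain Python for illustration only; analyze_raw and raw_phrase_match are hypothetical names, not Solr or Lucene APIs.

```python
import re

def analyze_raw(text):
    """Lowercase and strip punctuation; keep every word (no stopword removal, no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def raw_phrase_match(query, document):
    """True if the query's words appear in the document as one contiguous run."""
    q = analyze_raw(query)
    d = analyze_raw(document)
    if not q:
        return False
    n = len(q)
    return any(d[i:i + n] == q for i in range(len(d) - n + 1))

print(raw_phrase_match("Sound Of Music", "Sound Of Music!"))
print(raw_phrase_match("The Sound Of Music", "A Sound With Music"))
print(raw_phrase_match("to be or not to be", "To be, or not to be"))
```

Because no words are dropped, "The Sound Of Music" no longer matches "A Sound With Music", and an all-stopword phrase such as "to be or not to be" becomes searchable. The regular expression is a simplification: it splits on apostrophes and ignores non-ASCII letters, so a real field type would need a more careful tokenizer.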

Cheers,

Lance Norskog

-----Original Message-----
From: Phillip Farber [mailto:[EMAIL PROTECTED]
Sent: Friday, March 21, 2008 11:11 AM
To: solr-user@lucene.apache.org
Subject: stopwords and phrase queries


Am I correct that if I index with stop words: "to", "be", "or" and "not"
then phrase query "to be or not to be" will not retrieve any documents?

Is there any documentation that discusses the interaction of stop words and
phrase queries?  Thanks.


Phil

