It depends on whether the query parser is smart enough to optimize away empty boolean terms. Otherwise, the semantics of "x AND y" (or BooleanQuery with two "MUST" clauses) is the intersection of the documents selected by matching x and the documents selected by matching y. If y selects no documents, the intersection will be empty. Analysis is a separate semantic step from syntactic parsing, so if y is a stopword or a quoted phrase containing only a stopword, it parses fine, but a dumb query parser might generate a TermQuery with an empty term, which will match no documents.

Or, if stopwords are disabled at query time, but were enabled at index time, the TermQuery would refer to a term that cannot be found in the index.
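
To make the two cases above concrete, here is a minimal sketch in plain Java. It is a toy model using sets of document ids, not Lucene's actual API; the class and method names (EmptyClauseDemo, buildIndex, termQuery, and) are hypothetical and exist only for illustration. It shows that once stopwords are filtered at index time, a term query for a stopword finds no postings, so intersecting it with any other clause yields nothing.

```java
import java.util.*;

// Toy model only: these names are hypothetical, not Lucene classes.
class EmptyClauseDemo {

    // Build a tiny inverted index (term -> document ids), dropping stopwords
    // the way an index-time stop filter would.
    static Map<String, Set<Integer>> buildIndex(List<String> docs, Set<String> stopwords) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase().split("\\s+")) {
                if (token.isEmpty() || stopwords.contains(token)) {
                    continue; // stopwords never reach the index
                }
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // A term that is not in the index selects no documents.
    static Set<Integer> termQuery(Map<String, Set<Integer>> index, String term) {
        return index.getOrDefault(term, Collections.emptySet());
    }

    // "x AND y" modeled as the intersection of the two document sets.
    static Set<Integer> and(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(right);
        return result;
    }

    public static void main(String[] args) {
        Set<String> stopwords = new HashSet<>(Arrays.asList("and", "or", "in"));
        Map<String, Set<Integer>> index = buildIndex(
                Arrays.asList("jack and jill", "jack in the box", "jill or bob"), stopwords);

        Set<Integer> jack = termQuery(index, "jack");
        Set<Integer> stop = termQuery(index, "and"); // not in the index

        System.out.println("jack           -> " + jack);
        System.out.println("and            -> " + stop);
        System.out.println("jack AND and   -> " + and(jack, stop));
    }
}
```

In this model the query-time term "and" is looked up without any stop filtering, which corresponds to the second case: the term was removed at index time, so it cannot be found.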

-- Jack Krupansky

-----Original Message-----
From: Trejkaz
Sent: Thursday, June 07, 2012 5:44 PM
To: java-user@lucene.apache.org
Subject: Re: easy one? IN and OR stopword help

On Fri, Jun 8, 2012 at 5:35 AM, Jack Krupansky <j...@basetechnology.com> wrote:
> Well, if you have defined OR/or and IN/in as stopwords, what is it you expect other than for the analyzer to ignore those terms (which with a boolean “AND” means match nothing)?

Is this behaviour really logical?

If I search for a single phrase like "Jack and Jill", and "and" is a
stop word, it becomes "Jack - Jill", right? And then matches documents
which have Jack and Jill next to each other (although I'm not 100%
sure on whether term positions mess it up for this specific case as I
can't remember whether the term position increments on a stop word or
not. It's irrelevant for the next step in my logic anyway.)

If I search for a single term like "and", and "and" is a stop word, the
equivalent behaviour should be to search for [] (the empty term set),
and every item matches the empty term set, so {X} AND "and" should
return the same as {X} for any query {X}, I would have thought.
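
The behaviour argued for here can be sketched as follows, again as a toy model in plain Java and not as a description of what any Lucene query parser actually does. The names (DropEmptyClausesDemo, analyze, conjunction) are hypothetical. The assumption is that each clause is analyzed first, and a clause that analyzes to no tokens is simply skipped, so the conjunction starts from "all documents" and only narrows on clauses that survive analysis.

```java
import java.util.*;

// Toy model only: these names are hypothetical, not Lucene classes.
class DropEmptyClausesDemo {

    // Lowercase, split on whitespace, and drop stopwords.
    static List<String> analyze(String text, Set<String> stopwords) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (!token.isEmpty() && !stopwords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // AND over all clauses, treating a clause with no surviving tokens as
    // "no constraint" instead of "matches nothing".
    static Set<Integer> conjunction(List<String> clauses,
                                    Set<String> stopwords,
                                    Map<String, Set<Integer>> postings,
                                    Set<Integer> allDocs) {
        Set<Integer> result = new TreeSet<>(allDocs); // identity element for AND
        for (String clause : clauses) {
            List<String> tokens = analyze(clause, stopwords);
            if (tokens.isEmpty()) {
                continue; // clause was only stopwords: skip it
            }
            for (String token : tokens) {
                result.retainAll(postings.getOrDefault(token, Collections.emptySet()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        postings.put("jack", new TreeSet<>(Arrays.asList(0, 1)));
        postings.put("jill", new TreeSet<>(Arrays.asList(0, 2)));
        Set<Integer> allDocs = new TreeSet<>(Arrays.asList(0, 1, 2));
        Set<String> stopwords = new HashSet<>(Collections.singletonList("and"));

        System.out.println(conjunction(Arrays.asList("jack"), stopwords, postings, allDocs));
        System.out.println(conjunction(Arrays.asList("jack", "and"), stopwords, postings, allDocs));
    }
}
```

Under this assumption, {X} AND "and" returns the same documents as {X}. Note one consequence of the design: a query made up only of stopwords would match every document, which may or may not be what a given application wants.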

Is this some peculiarity with boolean query or query parser implementation?

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
