> From: Brian Goetz [mailto:[EMAIL PROTECTED]]
> 
> I think what QP needs to do here is run each discrete term through the
> analyzer, and change its behavior based on that.  For example, if the
> input text is "+A", but the analyzer turns "A" into "B B" (dumb
> example, but you get the idea) it needs to build a PhraseQuery, not a
> TermQuery.  Similarly, if the Analyzer eliminates A completely from
> the token stream, then QP should drop that term from the resulting
> Query that it's building.  In other words, have QP parse the query
> first, and then run each sub-part through the analyzer, and adjust
> for the number of tokens returned.

Things are a little more complicated than you describe, but basically this
is not a bad approach.  For the query "+A B" where "A" is a stop word, the
parser would effectively do something like:

  BooleanQuery query = new BooleanQuery();
  query.add(analyze("A"), true, false);
  query.add(analyze("B"), false, false);

The analyze(String) method would run the string through the analyzer and
turn it into a TermQuery or a PhraseQuery, depending on how many terms came
out.
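To make the token-count idea concrete, here is a minimal, self-contained
sketch.  It does not use the Lucene classes at all: the stop list, the toy
tokenizer and the string results ("TermQuery(...)", "PhraseQuery(...)") are
assumptions made up for illustration, and the zero-token case is only
flagged, not resolved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalyzeSketch {
    // Hypothetical stop list standing in for a real analyzer's configuration.
    static final Set<String> STOP_WORDS = Set.of("a", "the", "of");

    // Toy analyzer: lowercase, split on non-alphanumerics, drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Describes which kind of query one parsed sub-part would become,
    // based only on how many tokens survive analysis.
    static String analyze(String text) {
        List<String> tokens = tokenize(text);
        if (tokens.isEmpty()) {
            return "NONE";                                   // term disappeared
        }
        if (tokens.size() == 1) {
            return "TermQuery(" + tokens.get(0) + ")";       // single token
        }
        return "PhraseQuery(" + String.join(" ", tokens) + ")"; // several tokens
    }

    public static void main(String[] args) {
        System.out.println(analyze("B"));
        System.out.println(analyze("wi-fi"));
        System.out.println(analyze("the"));
    }
}
```

In a real parser the return type would be a query object rather than a
string; strings are used here only so the sketch runs on its own.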

But if the term disappears (as is the case for stop words) then what should
analyze(String) do?  Maybe it can throw an exception, which the parser can
handle, so that the above pseudo code turns into:

  BooleanQuery query = new BooleanQuery();
  try {
    query.add(analyze("A"), true, false);
  } catch (TermDisappeared e) {           // required term: must exist
    throw new RequiredTermDisappeared(e.getMessage());
  }
  try {
    query.add(analyze("B"), false, false);
  } catch (TermDisappeared e) {           // optional term: ignore exceptions
  }
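The same control flow can be tried out without Lucene.  The sketch below is
only an illustration under stated assumptions: the exception classes reuse
the names from the pseudo code but are defined locally, the stop list is
made up, and clauses are plain strings ('+' marks a required clause) rather
than real query objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordClauseSketch {
    // Hypothetical checked exceptions mirroring the names in the pseudo code.
    static class TermDisappeared extends Exception {
        TermDisappeared(String message) { super(message); }
    }

    static class RequiredTermDisappeared extends Exception {
        RequiredTermDisappeared(String message) { super(message); }
    }

    // Hypothetical stop list; a real analyzer would supply its own.
    static final Set<String> STOP_WORDS = Set.of("a", "the");

    // Toy analyze step: returns the surviving term, or signals that the
    // analyzer removed it entirely.
    static String analyze(String term) throws TermDisappeared {
        String lower = term.toLowerCase(Locale.ROOT);
        if (STOP_WORDS.contains(lower)) {
            throw new TermDisappeared("term removed by analyzer: " + term);
        }
        return lower;
    }

    // Builds a printable clause list from parallel arrays of terms and
    // "required" flags.
    static List<String> build(String[] terms, boolean[] required)
            throws RequiredTermDisappeared {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            try {
                clauses.add((required[i] ? "+" : "") + analyze(terms[i]));
            } catch (TermDisappeared e) {
                if (required[i]) {
                    // required term: the query cannot be satisfied as written
                    throw new RequiredTermDisappeared(e.getMessage());
                }
                // optional term: drop the clause and carry on
            }
        }
        return clauses;
    }

    public static void main(String[] args) throws Exception {
        // "+B A": the optional stop word is dropped, the required term stays.
        System.out.println(build(new String[] {"B", "A"},
                                 new boolean[] {true, false}));
    }
}
```

Calling build for "+A B", with "A" in the stop list, raises
RequiredTermDisappeared instead of returning a clause list, which is the
behaviour the try/catch blocks above describe.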

That looks feasible.  Anyone want to take on rewriting the query parser?

Doug

_______________________________________________
Lucene-dev mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-dev
