Sorry, I'm new to open source and not familiar with how to submit
patches :D
I'll post my solution here first to get comments from the community.
Since we must distinguish three possibilities (must have, may have, and
must not have), we need at least two boolean variables in
org.apache.nutch.searcher.Query. These two variables are isRequired and
isProhibited.

-First, I define a separate OR token in the jj file. It goes before
<WORD>, so it looks like this:
<OR: "OR">

-Second, I define a new function called disjunction:
void disjunction() :
{}
{
    <OR> nonOpOrTerm()
}

-Third, in the function parse(), I declare a boolean variable disj:
boolean disj;

-Fourth, inside parse(), once the lookahead is done, we check for the
OR token:
( LOOKAHEAD ... )?
// check OR
(disjunction() { disj = true; })*

-Finally, I changed the handling portion in parse():
      if (stop
          && field == Clause.DEFAULT_FIELD
          && terms.size()==1
          && isStopWord(array[0])) {
        // ignore stop words only when single, unadorned terms in default field
      } else {
        if (prohibited)
          query.addProhibitedPhrase(array, field);
        else if (disj)
          query.addOptionalPhrase(array, field);
        else
          query.addRequiredPhrase(array, field);
      }
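
The steps above can be sketched in plain Java, without JavaCC. This is only
an illustration of the control flow, not the generated parser: the class
OrParseSketch and the method classify() are made-up names, terms are split
on whitespace, stop-word handling is left out, and it assumes that disj is
reset to false after each clause (the snippets above do not show where the
reset happens).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the modified parse() loop; not Nutch code.
class OrParseSketch {

    /** Label each whitespace-separated term as REQUIRED, OPTIONAL or PROHIBITED. */
    static List<String> classify(String queryText) {
        List<String> clauses = new ArrayList<>();
        boolean disj = false;                  // set when an OR token precedes the next term
        for (String tok : queryText.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue;                      // empty input yields one empty token
            }
            if (tok.equals("OR")) {            // plays the role of the <OR> token
                disj = true;
                continue;
            }
            boolean prohibited = tok.startsWith("-") && tok.length() > 1;
            String term = prohibited ? tok.substring(1) : tok;
            String kind = prohibited ? "PROHIBITED" : (disj ? "OPTIONAL" : "REQUIRED");
            clauses.add(kind + ":" + term);
            disj = false;                      // assumption: the flag is reset per clause
        }
        return clauses;
    }

    public static void main(String[] args) {
        // prints [REQUIRED:this, OPTIONAL:that, PROHIBITED:other]
        System.out.println(classify("this OR that -other"));
    }
}
```

In this sketch only the term that follows OR becomes optional; the term
before it is still added as required.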

  After this, the changes to the jj file are complete. Note that I also
had to add the method addOptionalPhrase() to
org.apache.nutch.searcher.Query. This method simply sets isRequired=false
and isProhibited=false. The rest is already taken care of by Nutch.
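
To make the flag combinations concrete, here is a minimal sketch of how the
two booleans encode the three cases. It is a simplified stand-in, not the
real org.apache.nutch.searcher.Query: the class OptionalClauseSketch, its
nested Clause, the kind() helper and the "DEFAULT" field name are all
assumptions made for illustration. Only the three add...Phrase() method
names and the two flag names come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a query holding flagged clauses.
class OptionalClauseSketch {

    /** One phrase plus the two flags that encode its role. */
    static final class Clause {
        final String[] terms;
        final String field;
        final boolean isRequired;
        final boolean isProhibited;

        Clause(String[] terms, String field, boolean isRequired, boolean isProhibited) {
            this.terms = terms;
            this.field = field;
            this.isRequired = isRequired;
            this.isProhibited = isProhibited;
        }

        /** Map the flag pair back to the three possibilities. */
        String kind() {
            if (isProhibited) {
                return "must not have";
            }
            return isRequired ? "must have" : "may have";
        }
    }

    private final List<Clause> clauses = new ArrayList<>();

    void addRequiredPhrase(String[] terms, String field) {
        clauses.add(new Clause(terms, field, true, false));
    }

    void addProhibitedPhrase(String[] terms, String field) {
        clauses.add(new Clause(terms, field, false, true));
    }

    // The new method: neither required nor prohibited.
    void addOptionalPhrase(String[] terms, String field) {
        clauses.add(new Clause(terms, field, false, false));
    }

    List<Clause> getClauses() {
        return clauses;
    }

    public static void main(String[] args) {
        OptionalClauseSketch query = new OptionalClauseSketch();
        query.addRequiredPhrase(new String[] {"this"}, "DEFAULT");
        query.addOptionalPhrase(new String[] {"that"}, "DEFAULT");
        query.addProhibitedPhrase(new String[] {"other"}, "DEFAULT");
        for (Clause c : query.getClauses()) {
            System.out.println(String.join(" ", c.terms) + " -> " + c.kind());
        }
    }
}
```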

  Regards,
  Giang


On 3/15/06, Laurent Michenaud <[EMAIL PROTECTED]> wrote:
>
> I would like to use Boolean Query too :)
>
> -----Original Message-----
> From: Alexander Hixon [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 15 March 2006 08:38
> To: nutch-user@lucene.apache.org
> Subject: RE: Boolean OR QueryFilter
>
> Maybe you could post the code on JIRA, if anyone else wishes to use
> Boolean operators in their search queries..? We could probably get a
> developer or two to put this in the 0.8 release? Since it IS open source.
> ;)
>
> Just a thought,
> Alex
>
> -----Original Message-----
> From: Nguyen Ngoc Giang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 15 March 2006 3:45 PM
> To: nutch-user@lucene.apache.org; [EMAIL PROTECTED]
> Subject: Re: Boolean OR QueryFilter
>
>   Hi David,
>
>   I also did a similar task. In fact, I hacked into jj code to add the
> definition for OR and NOT. If you need any help, don't hesitate to contact
> me :).
>
>   Regards,
>    Giang
>
> PS: I also believe that a hack to jj code is necessary.
>
> On 3/8/06, David Odmark <[EMAIL PROTECTED]> wrote:
> >
> > Hi all,
> >
> > We're trying to implement a nutch app (version 0.8) that allows for
> > Boolean OR e.g. (this OR that) AND (something OR other). I've found
> > some relevant posts in the mailing list archive, but I think I'm
> > missing something. For example, here's a snippet from a post from
> > Doug Cutting:
> >
> > <snip>
> > that said, one can implement OR as a filter (replacing or altering
> > BasicQueryFilter) that scans for terms whose text is "OR" in the
> > default field.
> > </snip>
> >
> > The problem I'm finding is that the NutchAnalysis analyzer seems to be
> > swallowing all boolean terms by the time the QueryFilter is even
> > executed (perhaps because OR is a stop word?). To wit:
> >
> > String queryText = "this OR that";
> > org.apache.nutch.searcher.Query query =
> >     org.apache.nutch.searcher.Query.parse(queryText, conf);
> > for (int i = 0; i < query.getTerms().length; i++) {
> >     System.out.println("Term = " + query.getTerms()[i]);
> > }
> >
> > This results in output that looks like this:
> >
> > Term = this
> > Term = that
> >
> > So am I correct in believing that in order to implement boolean OR
> > using Nutch search and a QueryFilter, one must also (minimally) hack
> > the NutchAnalysis.jj file to produce a new analyzer? Also, given that
> > a Nutch Query object doesn't seem to have a method to add a
> > non-required Term or Phrase, does that need to be modified as well?
> >
> > Sorry for the long post, and thanks in advance...
> >
> > -David Odmark
> >
> >
> >
>
>
