Sorry, I'm a newbie in open source, and I'm not familiar with how to submit patches :D I'll put my solution here first to get comments from the community. Since we must distinguish 3 possibilities (must have, may have and must not have), we need at least 2 boolean variables in org.apache.nutch.searcher.Query. In fact, these 2 boolean variables are isRequired and isProhibited.
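To illustrate why two flags are enough for three cases, here is a minimal Java sketch. The class ClauseSketch and its Occur enum are hypothetical names made up for this illustration and are not the actual Nutch classes; only the flag names isRequired and isProhibited come from the text above. The fourth combination (both flags true) is treated as invalid, which is an assumption of the sketch.

```java
// Hypothetical sketch, NOT the real org.apache.nutch.searcher.Query code.
// It only shows how two booleans encode must / may / must not.
final class ClauseSketch {
    enum Occur { MUST, MAY, MUST_NOT }

    private final String term;
    private final boolean isRequired;
    private final boolean isProhibited;

    private ClauseSketch(String term, boolean isRequired, boolean isProhibited) {
        // Assumption of this sketch: a clause cannot be both required and prohibited.
        if (isRequired && isProhibited) {
            throw new IllegalArgumentException("clause cannot be required and prohibited");
        }
        this.term = term;
        this.isRequired = isRequired;
        this.isProhibited = isProhibited;
    }

    static ClauseSketch required(String term)   { return new ClauseSketch(term, true, false); }
    static ClauseSketch prohibited(String term) { return new ClauseSketch(term, false, true); }
    // The "may have" case: neither required nor prohibited.
    static ClauseSketch optional(String term)   { return new ClauseSketch(term, false, false); }

    String term() { return term; }

    Occur occur() {
        if (isProhibited) return Occur.MUST_NOT;
        return isRequired ? Occur.MUST : Occur.MAY;
    }

    public static void main(String[] args) {
        System.out.println(required("this").occur());   // prints MUST
        System.out.println(optional("that").occur());   // prints MAY
        System.out.println(prohibited("spam").occur()); // prints MUST_NOT
    }
}
```

The optional case is simply "both flags false", which is the state the OR handling needs to be able to produce.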
-In the first step, I define an OR token separately in the jj file. It is put before <WORD>, so it looks like this:

    <OR: "OR">

-Second, I define a new function called disjunction:

    void disjunction() :
    {}
    {
      <OR> nonOpOrTerm()
    }

-Third, in the function parse(), I declare a boolean variable disj (it has to be set to false at the start of each clause, so that one OR does not carry over to later clauses):

    boolean disj;

-Fourth, inside parse(), once we have finished looking ahead, we check for the OR token:

    ( LOOKAHEAD ... )?
    // check OR
    (disjunction() { disj = true; })*

-Finally, I changed the handling portion in parse():

    if (stop
        && field == Clause.DEFAULT_FIELD
        && terms.size()==1
        && isStopWord(array[0])) {
      // ignore stop words only when single, unadorned terms in default field
    } else {
      if (prohibited)
        query.addProhibitedPhrase(array, field);
      else if (disj)
        query.addOptionalPhrase(array, field);
      else
        query.addRequiredPhrase(array, field);
    }

At this point, I have finished changing the jj file. Please note that I also had to add the method addOptionalPhrase() to org.apache.nutch.searcher.Query. This method basically sets isRequired=false and isProhibited=false. The rest is already taken care of by Nutch.

Regards,
Giang

On 3/15/06, Laurent Michenaud <[EMAIL PROTECTED]> wrote:
>
> I would like to use Boolean Query too :)
>
> -----Original Message-----
> From: Alexander Hixon [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 15 March 2006 08:38
> To: nutch-user@lucene.apache.org
> Subject: RE: Boolean OR QueryFilter
>
> Maybe you could post the code on JIRA, if anyone else wishes to use
> Boolean operators in their search queries..? We could probably get a
> developer or two to put this in the 0.8 release? Since it IS open source.
> ;)
>
> Just a thought,
> Alex
>
> -----Original Message-----
> From: Nguyen Ngoc Giang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 15 March 2006 3:45 PM
> To: nutch-user@lucene.apache.org; [EMAIL PROTECTED]
> Subject: Re: Boolean OR QueryFilter
>
> Hi David,
>
> I also did a similar task.
> In fact, I hacked into the jj code to add the definition for OR and NOT.
> If you need any help, don't hesitate to contact me :).
>
> Regards,
> Giang
>
> PS: I also believe that a hack to the jj code is necessary.
>
> On 3/8/06, David Odmark <[EMAIL PROTECTED]> wrote:
> >
> > Hi all,
> >
> > We're trying to implement a nutch app (version 0.8) that allows for
> > Boolean OR e.g. (this OR that) AND (something OR other). I've found
> > some relevant posts in the mailing list archive, but I think I'm
> > missing something. For example, here's a snippet from a post from
> > Doug Cutting:
> >
> > <snip>
> > that said, one can implement OR as a filter (replacing or altering
> > BasicQueryFilter) that scans for terms whose text is "OR" in the
> > default field.
> > </snip>
> >
> > The problem I'm finding is that the NutchAnalysis analyzer seems to be
> > swallowing all boolean terms by the time the QueryFilter is even
> > executed (perhaps because OR is a stop word?). To wit:
> >
> >   String queryText = "this OR that";
> >   org.apache.nutch.searcher.Query query =
> >       org.apache.nutch.searcher.Query.parse(queryText, conf);
> >   for (int i = 0; i < query.getTerms().length; i++) {
> >     System.out.println("Term = " + query.getTerms()[i]);
> >   }
> >
> > This results in output that looks like this:
> >
> > Term = this
> > Term = that
> >
> > So am I correct in believing that in order to implement boolean OR
> > using Nutch search and a QueryFilter, one must also (minimally) hack
> > the NutchAnalysis.jj file to produce a new analyzer? Also, given that
> > a Nutch Query object doesn't seem to have a method to add a
> > non-required Term or Phrase, does that need to be modified as well?
> >
> > Sorry for the long post, and thanks in advance...
> >
> > -David Odmark