Hi all,

We're trying to implement a Nutch application (version 0.8) that supports Boolean OR, e.g. (this OR that) AND (something OR other). I've found some relevant posts in the mailing list archive, but I think I'm missing something. For example, here's a snippet from a post by Doug Cutting:

<snip>
that said, one can implement OR as a filter (replacing or altering BasicQueryFilter) that scans for terms whose text is "OR" in the default field.
</snip>
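
To check that I'm reading this correctly, here is a rough sketch of the scanning step as I understand it. It is plain Java with no Nutch classes, OrGrouper and groupOrTerms are names I made up for illustration, and it assumes the literal "OR" tokens actually reach the filter as ordinary terms:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch: shows only the grouping logic.
public class OrGrouper {

    // Terms joined by a literal "OR" end up in the same clause;
    // separate clauses are meant to be ANDed together.
    static List<List<String>> groupOrTerms(List<String> terms) {
        List<List<String>> clauses = new ArrayList<List<String>>();
        boolean joinNext = false;
        for (String term : terms) {
            if ("OR".equals(term)) {
                // A leading "OR" has nothing to join to, so it is ignored.
                joinNext = !clauses.isEmpty();
                continue;
            }
            if (joinNext) {
                // Previous token was "OR": extend the last clause.
                clauses.get(clauses.size() - 1).add(term);
            } else {
                // Otherwise start a new clause.
                List<String> clause = new ArrayList<String>();
                clause.add(term);
                clauses.add(clause);
            }
            joinNext = false;
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> terms =
            Arrays.asList("this", "OR", "that", "something", "OR", "other");
        // prints [[this, that], [something, other]]
        System.out.println(groupOrTerms(terms));
    }
}
```

In a real filter, each inner list would presumably become a group of optional clauses and each group would be required, but how to express that through the Nutch Query API is part of what I'm asking.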

The problem I'm finding is that the NutchAnalysis analyzer seems to swallow all Boolean operators before the QueryFilter is ever executed (perhaps because OR is a stop word?). To wit:

// conf is the Configuration instance already in scope
String queryText = "this OR that";
org.apache.nutch.searcher.Query query = org.apache.nutch.searcher.Query.parse(queryText, conf);
org.apache.nutch.searcher.Query.Term[] terms = query.getTerms();
for (int i = 0; i < terms.length; i++) {
    System.out.println("Term = " + terms[i]);
}

This results in output that looks like this:

Term = this
Term = that

So am I correct in believing that in order to implement boolean OR using Nutch search and a QueryFilter, one must also (minimally) hack the NutchAnalysis.jj file to produce a new analyzer? Also, given that a Nutch Query object doesn't seem to have a method to add a non-required Term or Phrase, does that need to be modified as well?

Sorry for the long post, and thanks in advance...

-David Odmark



