Please keep this thread going, as I am also curious to know why this has been 'forked'. I am sure that most of this lies within the original OPIC filter, but I still can't understand why straightforward Lucene queries have not been used within the application.
On 7/6/07, Kai_testing Middleton <[EMAIL PROTECTED]> wrote:
I've been reading up on NUTCH-479 "Support for OR queries", but I must be missing something obvious because I don't understand what the JIRA is about:

https://issues.apache.org/jira/browse/NUTCH-479

Description: There have been many requests from users to extend Nutch query syntax to add support for OR queries, in addition to the implicit AND and NOT queries supported now.

OK, so I guess what I don't understand is: what is the "Nutch query syntax"? The main discussion I found on nutch-user is this:

http://osdir.com/ml/search.nutch.devel/2004-02/msg00007.html

I was wondering why the query syntax is so limited. There are no OR queries, there are no fielded queries, or fuzzy, or approximate... Why? The underlying index supports all these operations.

I notice by looking at the or.patch file (https://issues.apache.org/jira/secure/attachment/12360659/or.patch) that one of the programs under consideration is:

nutch/searcher/Query.java

The code for this is distinct from:

lucene/search/Query.java

It looks like this is an architecture issue that I don't understand. If Nutch is an "extension" of Lucene, why does it define a different Query class? Why don't we just use the Lucene code to query the indexes? Does this have something to do with the Nutch webapp (nutch.war)? What is the historical genesis of this issue (or is that even relevant)?
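
To make the "implicit AND and NOT" wording from the JIRA description concrete, here is a toy sketch of how I read it. This is not Nutch's Query class or anything taken from or.patch; the class and method names (ImplicitAndSketch, toExplicit) are made up for illustration. It assumes that every bare term is required and that a leading "-" excludes a term, and it rewrites such a query into the +/- notation of Lucene's classic query parser, where a bare term would otherwise be optional.

```java
import java.util.ArrayList;
import java.util.List;

public class ImplicitAndSketch {

    // Hypothetical helper, not part of Nutch or Lucene.
    // Assumption: bare terms are required (implicit AND) and "-term" is excluded (NOT).
    static String toExplicit(String userQuery) {
        List<String> clauses = new ArrayList<>();
        for (String token : userQuery.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty query yields no clauses
            }
            if (token.startsWith("-") && token.length() > 1) {
                clauses.add(token);       // excluded term, already in -term form
            } else {
                clauses.add("+" + token); // required term gets an explicit "+"
            }
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        // Prints: +nutch +lucene -spam
        System.out.println(toExplicit("nutch lucene -spam"));
    }
}
```

If that reading is right, an OR query is exactly what this restricted syntax cannot express, since there is no way to mark a term as optional.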
-- "Conscious decisions by conscious minds are what make reality real"
