Hi,
yes, I guess having the full strength of Lucene-based queries would be
nice. That would also solve the boolean-query question I had a few
days ago :-)
Ravi, doesn't Lucene also allow querying of other fields? Is there any
possibility to add that feature to your proposal?
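If I read the Lucene query syntax correctly, a term can carry a field
prefix such as title:nutch, and the "content" passed to QueryParser is only
the default field for terms without a prefix. If bare terms should hit
several fields, one option would be to rewrite the query string before it
is parsed (Lucene also ships a MultiFieldQueryParser for this purpose, which
I have not tried with Nutch). Below is a rough sketch of the rewriting idea.
FieldQueryExpander and the field names are made up for illustration and are
not part of Nutch or Lucene, and it only handles whitespace-separated terms,
not phrases, parentheses or +/- prefixes:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Nutch or Lucene. */
public class FieldQueryExpander {

    /**
     * Rewrites each bare term of a whitespace-separated query so that it is
     * searched in all given fields. Terms that already carry a field prefix
     * (e.g. "title:nutch") and the operators AND, OR, NOT are kept as-is.
     */
    public static String expand(String query, String[] fields) {
        List<String> out = new ArrayList<String>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            if (isOperator(token) || token.indexOf(':') >= 0) {
                out.add(token);
            } else {
                out.add(acrossFields(token, fields));
            }
        }
        return String.join(" ", out);
    }

    /** True for the boolean keywords of the Lucene query syntax. */
    private static boolean isOperator(String token) {
        return token.equals("AND") || token.equals("OR") || token.equals("NOT");
    }

    /** Builds "(f1:term OR f2:term ...)" for one bare term. */
    private static String acrossFields(String term, String[] fields) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            sb.append(fields[i]).append(':').append(term);
        }
        return sb.append(')').toString();
    }
}
```

With the fields title and content, the query "lucene AND title:nutch" becomes
"(title:lucene OR content:lucene) AND title:nutch", which could then be
handed to parser.parse(). Whether StandardAnalyzer tokenizes those fields
the same way Nutch indexed them is something I have not checked.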
In general: what is the advantage of the current Nutch query parser over
the Lucene-based one?
Regards,
Stefan
Ravi Chintakunta wrote:
> Hi Cristina,
>
> You can achieve this by modifying the IndexSearcher to take the query
> String as an argument and then use
>
> org.apache.lucene.queryParser.QueryParser's parse(String) method to
> parse the query string. The modified method in IndexSearcher would
> look as below:
>
> public Hits search(String queryString, int numHits,
>                    String dedupField, String sortField, boolean reverse)
>     throws IOException, org.apache.lucene.queryParser.ParseException {
>
>   // "content" is the default field for terms without a field prefix
>   org.apache.lucene.queryParser.QueryParser parser =
>       new org.apache.lucene.queryParser.QueryParser("content",
>           new org.apache.lucene.analysis.standard.StandardAnalyzer());
>
>   // parse() throws the checked ParseException on malformed input
>   org.apache.lucene.search.Query luceneQuery = parser.parse(queryString);
>
>   return translateHits(
>       optimizer.optimize(luceneQuery, luceneSearcher, numHits,
>                          sortField, reverse),
>       dedupField, sortField);
> }
>
> For this you have to modify the code in search.jsp and NutchBean too,
> so that you are passing on the raw query string to IndexSearcher.
>
> Note that with this approach, you are limiting the search to the content
> field.
>
>
> - Ravi Chintakunta
>
>
>
> On 10/4/06, Cristina Belderrain <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> we all know that Lucene supports, among others, boolean queries. Even
>> though Nutch is built on Lucene, boolean clauses are removed by Nutch
>> filters so boolean queries end up as "flat" queries where terms are
>> implicitly connected by an OR operator, as far as I can see.
>>
>> Is there any simple way to turn off the filtering so a boolean query
>> remains as such after it is submitted to Nutch?
>>
>> Just in case a simple way doesn't exist, Ravi Chintakunta suggests the
>> following workaround:
>>
>> "We have to modify the analyzer and add more plugins to Nutch
>> to use the Lucene's query syntax. Or we have to directly use
>> Lucene's Query Parser. I tried the second approach by modifying
>> org.apache.nutch.searcher.IndexSearcher and that seems to work."
>>
>> Can anyone please elaborate on what Ravi actually means by "modifying
>> org.apache.nutch.searcher.IndexSearcher"? Which methods are supposed
>> to be modified and how?
>>
>> It would be really nice to know how to do this. I believe many other
>> Nutch users would also benefit from an answer to this question.
>>
>> Thanks so much,
>>
>> Cristina
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general