Victor Lee wrote:

Hi,
I want to search for an exact match of "good morning",
not a phrase match of it. How do I do that in Nutch? For example, a phrase match returns docs that contain "good
morning sir" or "hello good morning", but an exact match
returns only docs that contain just "good morning".

Many thanks.

I'm afraid this won't be possible without changing the index. If we assume the index can be changed, there are a couple of ways to do this. The first has two parts:

* add this text to a field that is not tokenized (e.g. a "spec" field) and that contains just this value

* add a QueryPlugin, which will translate the query
      spec:"good morning"
   into a TermQuery over that field
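To illustrate why the untokenized field gives an exact match, here is a rough sketch in plain Python. It is a toy in-memory model, not Nutch or Lucene code: the functions tokenize, phrase_match, build_spec_index and exact_match are all made up for this example, and the whitespace tokenizer is only a stand-in for a real analyzer.

```python
def tokenize(text):
    """Lowercase and split on whitespace (stand-in for a real analyzer)."""
    return text.lower().split()


def phrase_match(doc_text, phrase):
    """True if the phrase tokens occur consecutively anywhere in the document."""
    doc, query = tokenize(doc_text), tokenize(phrase)
    return any(doc[i:i + len(query)] == query
               for i in range(len(doc) - len(query) + 1))


def build_spec_index(docs):
    """Model an untokenized field: the whole value is stored as one single term."""
    index = {}
    for doc_id, text in docs.items():
        index.setdefault(text, set()).add(doc_id)
    return index


def exact_match(spec_index, query):
    """Model a single-term lookup: the query must equal the whole stored value."""
    return spec_index.get(query, set())


docs = {
    1: "good morning",
    2: "good morning sir",
    3: "hello good morning",
}
spec_index = build_spec_index(docs)

phrase_hits = {doc_id for doc_id, text in docs.items()
               if phrase_match(text, "good morning")}
exact_hits = exact_match(spec_index, "good morning")
```

Here phrase_hits contains all three documents, while exact_hits contains only document 1. Note that the stored value is compared verbatim in this sketch, so any case folding or whitespace trimming would have to be applied identically at indexing time and at query time.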

You could also go this way: modify your indexing so that the indexed text always includes start and end markers, e.g. __START__ and __END__ (you can do this in NutchDocumentAnalyzer). Then write a query plugin which, for exact matches, rewrites the query to:

   "__START__ query __END__"

This will be parsed into a phrase query, but it will match only the documents whose whole text is this exact phrase, because the markers pin the phrase to the start and the end of the text...
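The marker idea can be sketched the same way in plain Python. Again this is only a toy model under stated assumptions, not Nutch code: analyze_with_markers, rewrite_exact_query, contains_phrase and exact_match_with_markers are hypothetical names, and the whitespace tokenizer stands in for whatever NutchDocumentAnalyzer would really do.

```python
START, END = "__START__", "__END__"


def analyze_with_markers(text):
    """Tokenize the text and wrap the token stream in start/end markers."""
    return [START] + text.lower().split() + [END]


def rewrite_exact_query(query):
    """Rewrite an exact-match query into the phrase '__START__ query __END__'."""
    return [START] + query.lower().split() + [END]


def contains_phrase(tokens, phrase_tokens):
    """Ordinary phrase matching: the phrase tokens must appear consecutively."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def exact_match_with_markers(doc_text, query):
    """A phrase query over marked tokens matches only if the whole text matches."""
    return contains_phrase(analyze_with_markers(doc_text),
                           rewrite_exact_query(query))


results = {
    text: exact_match_with_markers(text, "good morning")
    for text in ("good morning", "good morning sir", "hello good morning")
}
```

In results only "good morning" maps to True. The other two texts still contain the words "good morning", but not immediately between the two markers, so the rewritten phrase does not match them. One caveat of this sketch: it assumes the marker strings never occur in real document text.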

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

