Re: Searching multiple indexes with Nutch

Koch Martina Tue, 16 Dec 2008 23:16:39 -0800

Hi Vijay,

I'm not sure about the answer to your first question, so I'll leave it to 
others.


Question 2 can be solved with distributed search:

1. Start your search servers, one per index (use absolute paths):
   bin/nutch server <SERVER PORT> <CRAWL DIRECTORY>
2. Create a file named "search-servers.txt" in a directory of your choice. Each
   line specifies one search server by host and port:
   <SERVER HOST> <SERVER PORT>
3. In the nutch-site.xml of your servlet container, change the property
   "searcher.dir" to point to the directory where you stored the
   search-servers.txt file.
4. Restart your servlet container application.

That's it.
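
As a rough sketch of the steps above for the crawl1/ and crawl2/ case: the directory /tmp/nutch-search, the ports 9001 and 9002, the host localhost and the /data/crawl1 and /data/crawl2 paths are all placeholders I made up for illustration, so adjust them to your setup. The script only starts the servers if it is run from a directory that contains bin/nutch, and it prints the property entry instead of editing nutch-site.xml for you.

```shell
#!/bin/sh
# Sketch only: directory, host, ports and crawl paths are hypothetical placeholders.
SEARCH_DIR="${SEARCH_DIR:-/tmp/nutch-search}"   # directory that searcher.dir should point to
mkdir -p "$SEARCH_DIR"

# One line per search server: <SERVER HOST> <SERVER PORT>
cat > "$SEARCH_DIR/search-servers.txt" <<EOF
localhost 9001
localhost 9002
EOF

# Start one search server per index, using absolute crawl paths.
# Skipped when the script is not run from a Nutch installation directory.
if [ -x bin/nutch ]; then
  bin/nutch server 9001 /data/crawl1 &
  bin/nutch server 9002 /data/crawl2 &
else
  echo "bin/nutch not found here; start the search servers manually."
fi

# Print the entry to put into the nutch-site.xml of the servlet container.
cat <<EOF
<property>
  <name>searcher.dir</name>
  <value>$SEARCH_DIR</value>
</property>
EOF
```

After updating nutch-site.xml with the printed entry, restart the servlet container so the web application picks up the new searcher.dir.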

Kind regards,
Martina

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Vijay Krishnan
Sent: 16 December 2008 21:57
To: [email protected]
Subject: Searching multiple indexes with Nutch

Hi all,

I have two questions, both pertaining to Nutch 0.9.
1. Is it possible to make Nutch search the indexes directory directly,
without running IndexMerger on it to generate the index directory, possibly
at the cost of a small performance hit?

2. Suppose I have two directories, crawl1/ and crawl2/, each of which contains
the crawldb, linkdb, segments, indexes and index directories. Is there an
easy way to get Nutch to return search results from both the
crawl1/ and crawl2/ directories? Can I simply configure it in
nutch-default.xml or nutch-site.xml or a similar configuration file, or would
I need to somehow run two instances of Nutch, get results for each query
from both and manually put them together before serving them? If I need
to do the latter, is there a clean way to do it?


Thanks,
Vijay
