jqq wrote:
Hi,
I have two computers, which are PC1 and PC2.
        PC1:windows xp,cygwin,tomcat,IP 10.0.0.2
        PC2:windows xp,cygwin,IP 10.0.0.3
The PC1 and PC2 have crawled different data.
For searching multiple indexes, my configuration is as follows:
1. Configure the conf/slaves file on both computers; the file contains the following:
                   10.0.0.2
                   10.0.0.3

conf/slaves doesn't configure the searching - it's only needed when starting/stopping a MapReduce cluster.

2. I created a file called search-servers.txt:
           PC1: c:\nutch\servers\search-servers.txt
           PC2: c:\nutch\servers\search-servers.txt
  This file contains the following (host, port):
          10.0.0.2 9988
          10.0.0.3 9988
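From the Cygwin shell, a file like this can be written with a heredoc. A minimal sketch, using /tmp/nutch-servers as a stand-in for the c:\nutch\servers directory from the post (which Cygwin would see as /cygdrive/c/nutch/servers):

```shell
# Create the search-servers.txt file listing each search server as "host port".
# /tmp/nutch-servers is an illustrative path standing in for c:\nutch\servers.
mkdir -p /tmp/nutch-servers
cat > /tmp/nutch-servers/search-servers.txt <<'EOF'
10.0.0.2 9988
10.0.0.3 9988
EOF

# Show the resulting file.
cat /tmp/nutch-servers/search-servers.txt
```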
3. Open c:\tomcat\webapps\nutch\WEB-INF\classes\nutch-site.xml and set the searcher.dir property, so the property is:
              <name>searcher.dir</name>
              <value>c:\nutch\servers</value>

In this directory you should put a file search-servers.txt that contains:

10.0.0.2 9988
10.0.0.3 9988
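For reference, the searcher.dir setting quoted above needs to sit inside a full property element in nutch-site.xml; a sketch of the complete entry, using the path from the original post:

```xml
<property>
  <name>searcher.dir</name>
  <value>c:\nutch\servers</value>
</property>
```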

4. Start the search servers by typing:
             PC1: ./bin/nutch server 9988 /cygdrive/e/crawl
             PC2: ./bin/nutch server 9988 /cygdrive/e/crawl

Then I start Tomcat and go to http://10.0.0.2/nutch/ , but my search returns 0 hits. Tomcat's log shows:
 DistributedSearch- Querying segments from search servers...
 DistributedSearch- STATS:2 servers,0 segments.

Why are there 0 segments?

A common mistake is also to use a hadoop-site.xml that configures the Hadoop FS layer to use the distributed filesystem (DFS) while the data is actually located on the local filesystem.
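To spell that out: if the crawl data lives on the local disk (as the /cygdrive/e/crawl paths above suggest), the filesystem setting in hadoop-site.xml should point at the local filesystem, not at a DFS namenode. A sketch of the relevant property, assuming the 0.8-era convention where the local filesystem is named "local" (later Hadoop versions use a file:/// URI instead):

```xml
<property>
  <name>fs.default.name</name>
  <value>local</value>
</property>
```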



--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
