jqq wrote:
> Hi,
> I have two computers, PC1 and PC2.
> PC1: Windows XP, Cygwin, Tomcat, IP 10.0.0.2
> PC2: Windows XP, Cygwin, IP 10.0.0.3
> PC1 and PC2 have crawled different data.
> To search across both indexes, my configuration is as follows:
> 1. Configure the conf/slaves file on both computers; the file contains
> the following:
> 10.0.0.2
> 10.0.0.3
conf/slaves doesn't configure the searching - it's only needed when
starting / stopping a map-reduce cluster.
> 2. I created a file called search-servers.txt:
> PC1: c:\nutch\servers\search-servers.txt
> PC2: c:\nutch\servers\search-servers.txt
> This file contains the following (host port):
> 10.0.0.2 9988
> 10.0.0.3 9988
> 3. Open c:\tomcat\webapps\nutch\WEB-INF\classes\nutch-site.xml and set
> the searcher.dir property:
> <property>
>   <name>searcher.dir</name>
>   <value>c:\nutch\servers</value>
> </property>
In this directory you should put a file search-servers.txt that contains:
10.0.0.2 9988
10.0.0.3 9988
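The search-servers.txt format is one whitespace-separated host/port pair per line. As a minimal sketch of how such a file is read (a hypothetical helper for illustration, not Nutch's actual parsing code):

```python
# Hypothetical parser for a search-servers.txt-style file:
# one "host port" pair per line, whitespace-separated.
def parse_search_servers(text):
    servers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host, port = line.split()
        servers.append((host, int(port)))
    return servers

print(parse_search_servers("10.0.0.2 9988\n10.0.0.3 9988\n"))
# → [('10.0.0.2', 9988), ('10.0.0.3', 9988)]
```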
> 4. Start the search servers by typing:
> PC1: ./bin/nutch server 9988 /cygdrive/e/crawl
> PC2: ./bin/nutch server 9988 /cygdrive/e/crawl
> Then I start Tomcat and go to http://10.0.0.2/nutch/ , but my search
> returns 0 hits. Tomcat's log shows:
> DistributedSearch - Querying segments from search servers...
> DistributedSearch - STATS: 2 servers, 0 segments.
> Why are there 0 segments?
A common mistake is also to use a hadoop-site.xml that configures the
Hadoop FS layer to use the distributed filesystem (DFS), while the data
is actually located on the local filesystem.
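If the crawl data lives on a local disk, hadoop-site.xml (for both the search servers and the webapp) should select the local filesystem. A sketch of such a fragment, assuming the Hadoop of this era where the value "local" names the local filesystem:

```xml
<!-- hadoop-site.xml: hypothetical fragment; "local" selects the local
     filesystem rather than DFS -->
<property>
  <name>fs.default.name</name>
  <value>local</value>
</property>
```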
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com