Just in case anybody else comes up against this one. If you start the
search server from the same directory that you use to run dfs and
mapreduce, as I was, then the search server picks up the
hadoop-site.xml and nutch-site.xml files in conf. If those files point
the filesystem to a DFS, then the search server will search the DFS for
the directory you provide on startup. If you are searching a local
filesystem for the index, you will need to point the filesystem to
local in the configuration xml files.
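For reference, this is roughly what the relevant property looks like in
each case; the namenode host and port below are just placeholders for
whatever your DFS setup actually uses:

<property>
  <name>fs.default.name</name>
  <value>local</value>
</property>

versus, when pointing at a DFS:

<property>
  <name>fs.default.name</name>
  <value>namenode.example.com:9000</value>
</property>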
My problem was that I was searching the DFS for a local path. I thought
that when I started the search server it took a local path, but it
doesn't. It takes a path and then determines which filesystem to use
from the configuration xml files, in my case the ones in the conf
directory. When I changed the filesystem to local, everything worked
fine.
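To make that concrete, here is a minimal sketch of the two behaviors,
assuming the server is started from a directory whose conf holds the
config files:

# with fs.default.name set to local, the path below is resolved
# against the local filesystem:
bin/nutch server 2388 /d01/crawl

# with fs.default.name pointing at a namenode, the exact same
# command would look for /d01/crawl on the DFS instead.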
Dennis
Dennis Kubes wrote:
Can someone explain how to set up distributed searching? I have a
nutch-site.xml file set up like this:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>local</value>
  </property>
  <property>
    <name>searcher.dir</name>
    <value>C:\SERVERS\tomcat\webapps\ROOT\WEB-INF\classes</value>
  </property>
</configuration>
There is also a search-servers.txt file in the classes directory, with
entries that look like this:
andromeda02 2388
The search server on that machine is started and running with this
command:
bin/nutch server 2388 /d01/crawl
I can see it running and I can see the connect output, but searches
return no results. The /d01/crawl directory contains an indexes
directory (merged indexes), linkdb, crawldb, and segments. Am I doing
something obviously wrong here?
Dennis