I was able to set up Nutch searchers in a distributed fashion by creating the search-servers.txt file
at the root of the data directory where Tomcat was running.  I had a total of 1.9 MM URLs, split in half,
with one half for each searcher.
I was very surprised to see that the performance numbers I got for this setup were not as good as
I was expecting.  Before I ran this setup, I ran the same test on a single searcher with all 1.9 MM URLs.
The results for the distributed setup were the same or even worse.
 
One thing I suspect is that Tomcat is querying the Nutch search servers synchronously, one server
at a time, instead of asynchronously (all servers at once).  That would explain a lot.
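
To make the question concrete, here is a small Java sketch of the two behaviors I have in mind.
It is not Nutch code: FanOutSketch, fakeSearcher, queryOneAtATime and queryAllAtOnce are names I
made up, and each "server" is simulated with a sleep, so it only illustrates the timing difference
and says nothing about what Nutch actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutSketch {

    // Hypothetical stand-in for one search server (NOT a Nutch class).
    // It sleeps for a fixed latency and then returns a hit count.
    static Callable<Integer> fakeSearcher(final long latencyMs, final int hits) {
        return new Callable<Integer>() {
            public Integer call() throws Exception {
                Thread.sleep(latencyMs);
                return hits;
            }
        };
    }

    // One server at a time: elapsed time is roughly the SUM of the latencies.
    static int queryOneAtATime(List<Callable<Integer>> servers) throws Exception {
        int total = 0;
        for (Callable<Integer> server : servers) {
            total += server.call();
        }
        return total;
    }

    // All servers at once: elapsed time is roughly the SLOWEST latency.
    // Assumes the list is non-empty (a fixed pool needs at least one thread).
    static int queryAllAtOnce(List<Callable<Integer>> servers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(servers.size());
        try {
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(servers)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> servers = new ArrayList<Callable<Integer>>();
        servers.add(fakeSearcher(200, 10)); // pretend searcher for half of the URLs
        servers.add(fakeSearcher(200, 15)); // pretend searcher for the other half

        long t0 = System.nanoTime();
        int sequentialHits = queryOneAtATime(servers);
        long t1 = System.nanoTime();
        int parallelHits = queryAllAtOnce(servers);
        long t2 = System.nanoTime();

        System.out.println("one at a time: " + sequentialHits + " hits, "
                + (t1 - t0) / 1000000 + " ms");
        System.out.println("all at once:   " + parallelHits + " hits, "
                + (t2 - t1) / 1000000 + " ms");
    }
}
```

If the front end behaves like queryOneAtATime, then splitting the index in two would not reduce
the query time much, which is about what I am seeing.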
 
Can somebody tell me if this is true?
 
I'm running Nutch 0.5 on very beefy machines.
 
Thanks,
 
Ledio
