A better configuration would be something like this:

1) 3 web servers
2) 3 (or more) index servers
3) The web servers connect to the index servers through the distributed search setup, using the search-servers.txt file (example below).
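For reference, this is roughly what that setup looks like (the hostnames, port and crawl path below are made up; it assumes the 0.9/1.0-era distributed search, where searcher.dir in nutch-site.xml points at the directory holding search-servers.txt):

  # search-servers.txt -- one "host port" line per index server
  index1 9999
  index2 9999
  index3 9999

  # on each index server, serve its local crawl over that port
  bin/nutch server 9999 /data/nutch/crawl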

Hadoop is there to help process (build) the index; for searching, the index would still be deployed on a local file system. Depending on the index size, the distributed search servers allow you to scale out. Each web server can connect to all search servers programmatically through the NutchBean class, driven by search-servers.txt.
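Programmatically, the code on each web server would look roughly like this (a minimal sketch against the old org.apache.nutch.searcher API; the query string is a placeholder and error handling is omitted):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.nutch.searcher.Hit;
  import org.apache.nutch.searcher.HitDetails;
  import org.apache.nutch.searcher.Hits;
  import org.apache.nutch.searcher.NutchBean;
  import org.apache.nutch.searcher.Query;
  import org.apache.nutch.util.NutchConfiguration;

  public class DistributedSearchExample {
    public static void main(String[] args) throws Exception {
      // With searcher.dir pointing at the directory that contains
      // search-servers.txt, the bean fans the query out to all the
      // index servers instead of opening a local index.
      Configuration conf = NutchConfiguration.create();
      NutchBean bean = new NutchBean(conf);

      Query query = Query.parse("nutch", conf);   // placeholder query
      Hits hits = bean.search(query, 10);         // top 10 hits

      for (int i = 0; i < hits.getLength(); i++) {
        Hit hit = hits.getHit(i);
        HitDetails details = bean.getDetails(hit);
        System.out.println(details.getValue("url"));
      }
    }
  }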

Dennis

Rinesh1 wrote:
Hi Ronny,
   I have not worked on Hadoop. I went through the description of Hadoop
on the site.
   What I have understood is that Hadoop, with respect to Nutch, may break the index I
down into I1, I2, I3, ... In and may help in increasing the speed. In this case all the
Nutch instances share the same crawl folder.

   Can Hadoop be used to make copies CI1, CI2, ... CIn of index I and have
Nutch instance Ni use CIi?
Please let me know if I have been clear.

Regards,
Rinesh

Ronny-28 wrote:
Hadoop is there for you

Rinesh1 wrote:
Hi All,
    I need help with deploying Nutch in a load-balanced environment.
I have 3 web servers and 3 app servers. Nutch is to be deployed on all 3 app servers.
    For simplicity, I was wondering whether it is possible for the 3 Nutch
web applications to share the same crawl.
Thanks in advance,
Rinesh


--
******************************************************************
Ronny .M.
MPUTA.com
Africa's Search
http://www.mputa.com
******************************************************************



