A better configuration would be something like this:
1) 3 web servers
2) 3 (multiple) index servers
3) Site servers connect to the index servers through the distributed search
setup, using a search-servers.txt file.
Hadoop is there to help process the index; deployment would still be on a
local file system for searching. Depending on the index size, the
distributed search servers let you scale up. Each web server can
connect to all of the search servers programmatically through the NutchBean
class, using search-servers.txt.
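A minimal sketch of that wiring, assuming the Nutch 0.9/1.0-era searcher API
(the searcher.dir path and query string below are placeholders;
search-servers.txt lists one "host port" pair per index server):

import org.apache.hadoop.conf.Configuration;
import org.apache.nutch.searcher.Hit;
import org.apache.nutch.searcher.Hits;
import org.apache.nutch.searcher.NutchBean;
import org.apache.nutch.searcher.Query;
import org.apache.nutch.util.NutchConfiguration;

public class DistributedSearchExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = NutchConfiguration.create();

    // Point searcher.dir at a directory containing search-servers.txt
    // (one "host port" line per index server). With that file present,
    // NutchBean queries the remote search servers instead of a local index.
    conf.set("searcher.dir", "/opt/nutch/searcher");

    NutchBean bean = new NutchBean(conf);
    Query query = Query.parse("hadoop nutch", conf);
    Hits hits = bean.search(query, 10);   // top 10 hits across all servers

    for (int i = 0; i < hits.getLength(); i++) {
      Hit hit = hits.getHit(i);
      System.out.println(bean.getDetails(hit).getValue("url"));
    }
    bean.close();
  }
}

Each index server would serve its local index with something like
"bin/nutch server 9999 /local/path/to/crawl" (the stock distributed search
server command in that era; check your version's docs for the exact form).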
Dennis
Rinesh1 wrote:
Hi Ronny,
I have not worked on Hadoop. I went through the description of Hadoop
on the site.
What I have understood is that Hadoop, with respect to Nutch, may break down
index I into I1, I2, I3, ... In and may help in increasing the speed. In this
case all the Nutch instances share the same crawl folder.
Can Hadoop be used to make copies CI1, CI2, ... CIn of index I and have
Nutch instance Ni use CIi?
Please let me know if my question is clear.
Regards,
Rinesh
Ronny-28 wrote:
Hadoop is there for you
Rinesh1 wrote:
Hi All,
I need help deploying Nutch in a load-balanced environment.
I have 3 web servers and 3 app servers.
Nutch is to be deployed on all 3 app servers.
For simplicity, I was wondering whether it is possible to share the same
crawl between the 3 Nutch web applications.
Thanks in advance,
Rinesh
--
******************************************************************
Ronny .M.
MPUTA.com
Africa's Search
http://www.mputa.com
******************************************************************