Hello Everyone,

I have just gotten a basic Nutch/Hadoop configuration going. My
configuration is three Dell 2850s, each with dual 3.6 GHz processors.
The master node has 8 GB of RAM and the two slaves have 5 GB each.

The hard drive configuration on the master is dual 128 GB 15k RPM drives
in a RAID 0 configuration.

I have each of the slave machines set up with VMware ESXi, and each
hosts 4 virtual Nutch crawlers, each getting 1.2 GB of RAM and its own
73 GB 10,000 RPM SCSI drive, giving me a total of 8 slaves.
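
As a rough sanity check on the numbers above, here is a minimal sketch
of the per-host memory budget. The variable and function names are made
up for illustration, and it ignores whatever RAM ESXi itself needs
(which I have not measured), so the real headroom is probably smaller.

```python
# Figures come from the setup described above; names are illustrative only.
HOSTS = 2            # physical slave servers
HOST_RAM_GB = 5.0    # physical RAM in each slave server
VMS_PER_HOST = 4     # virtual Nutch crawlers per ESXi host
VM_RAM_GB = 1.2      # RAM assigned to each virtual crawler


def memory_budget(host_ram_gb, vms_per_host, vm_ram_gb):
    """Return (RAM assigned to guests, RAM left over) for one host, in GB.

    Assumption: hypervisor overhead is not modeled here.
    """
    assigned = round(vms_per_host * vm_ram_gb, 1)
    leftover = round(host_ram_gb - assigned, 1)
    return assigned, leftover


assigned, leftover = memory_budget(HOST_RAM_GB, VMS_PER_HOST, VM_RAM_GB)
total_slaves = HOSTS * VMS_PER_HOST

print(f"Guest RAM per host: {assigned} GB of {HOST_RAM_GB} GB")
print(f"Left over per host: {leftover} GB (before hypervisor overhead)")
print(f"Total Hadoop slaves: {total_slaves}")
```

So nearly all of each host's RAM is already handed out to the guests,
which is part of why I am asking the question below.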

I was doing research and wondering whether it would be more effective to
just run the two slave servers without virtualization, each having 5 GB
of RAM and a larger RAID 0 array. I was also wondering what settings I
can use to maximize memory usage on the master. I am currently using
rsync because I am still adding plugins and it makes deploying them to
all the machines easier, but if I need to disable it to have a
customized configuration on the master node, that is fine.

Thanks,

Chris
