Hello Everyone,
I have just gotten a basic Nutch/Hadoop configuration going. My setup is three Dell 2850s, each with dual 3.6 GHz processors. The master node has 8 GB of RAM and the two slaves have 5 GB each. The master's storage is two 128 GB 15k RPM drives in a RAID 0 configuration. Each of the slave machines runs VMware ESXi and hosts 4 virtual Nutch crawlers, each with 1.2 GB of RAM and its own 73 GB 10,000 RPM SCSI drive, giving me a total of 8 slaves.

I was doing some research and wondering whether it would be more effective to just run the two slave servers without virtualization, each with its full 5 GB of RAM and a larger RAID 0 array.

I was also wondering what settings I can use to maximize memory usage on the master. I am currently using rsync because I am still adding plugins and it makes deploying them to all the machines easier, but if I need to disable it to have a customized configuration on the master node, that is fine.

Thanks,
Chris

