On 22/09/11 05:42, praveenesh kumar wrote:
Hi all,

Can we replace our namenode machine later with some other machine?
Actually, I got a new server machine in my cluster, and now I want to make
this machine my new namenode and jobtracker node.
Also, does the Namenode/JobTracker machine's configuration need to be better
than the datanodes'/tasktrackers'?


1. I'd give it lots of RAM: the namenode holds the metadata for every file in memory, and you want to avoid swapping.

2. I'd make sure the disks are RAID5, with some NFS-mounted FS that the secondary namenode can talk to. This avoids the risk of losing the index, which, if it happens, renders your filesystem worthless. If I were really paranoid I'd have twin RAID controllers with separate connections to disk arrays in separate racks, as [Jiang2008] shows that interconnect problems on disk arrays can be more frequent than HDD failures.

3. If your central switches are at 10 GbE, consider getting a 10 GbE NIC and hooking it up directly. This stops the network being the bottleneck, though it does mean the server can have a lot more packets hitting it, which puts more load on it.

4. Leave space for a second CPU and time for GC tuning.
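
To put a rough number on point 1, here is a back-of-envelope sketch in Python. The figure of about 150 bytes of heap per namespace object (file, directory or block) is a commonly quoted rule of thumb, not something measured on your cluster. The function name and the example counts are made up for illustration. Treat the result as a floor for sizing, not a guarantee.

```python
# Assumed rule of thumb: ~150 bytes of namenode heap per namespace object.
# This is an estimate for sizing discussions, not a measured value.
BYTES_PER_OBJECT = 150


def estimate_namenode_heap_bytes(num_files, blocks_per_file=1.5, num_dirs=0,
                                 bytes_per_object=BYTES_PER_OBJECT):
    """Estimate the heap needed to hold the namespace in memory.

    Counts one object per file, one per directory and one per block,
    then multiplies by the assumed per-object cost.
    """
    num_blocks = num_files * blocks_per_file
    total_objects = num_files + num_dirs + num_blocks
    return int(total_objects * bytes_per_object)


if __name__ == "__main__":
    # Hypothetical cluster: 10 million files, 1 million directories.
    heap = estimate_namenode_heap_bytes(10_000_000, 1.5, 1_000_000)
    print(f"Estimated namespace heap: {heap / 1e9:.1f} GB")
```

I'd leave generous headroom on top of whatever this gives you, since the JVM, GC overhead and future growth all eat into it.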


JTs are less important; they need RAM but use HDFS for storage. If your cluster is small, the NN and JT can run on the same machine. If you do this, set up DNS so that two hostnames point to the same network address. Then if you ever split them off, everyone whose bookmark says http://jobtracker won't notice.
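
A quick way to sanity-check the DNS setup is a few lines of Python. This is only a sketch: the hostnames "namenode" and "jobtracker" are placeholders for whatever names you pick, and the helper name is made up. It uses the standard library's socket.gethostbyname, so it only looks at a single IPv4 address per name.

```python
import socket


def point_to_same_address(host_a, host_b, resolve=socket.gethostbyname):
    """Return True if both hostnames resolve to the same IPv4 address.

    `resolve` is injectable so the check can be exercised without real DNS.
    """
    return resolve(host_a) == resolve(host_b)
```

While the two services share a machine, point_to_same_address("namenode", "jobtracker") should return True; after you split them off and repoint one name, it should return False, and nobody's bookmarks need to change.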

Either way: the NN and the JT are the machines whose availability you care about. The rest is just a source of statistics you can look at later.

-Steve



[Jiang2008] "Are disks the dominant contributor for storage failures?: A comprehensive study of storage subsystem failure characteristics". ACM Transactions on Storage.
