Re: Colocation of NameNode and JobTracker

Steve Loughran Tue, 21 Jul 2009 04:40:52 -0700

Ravi Phulari wrote:

Hello Roman,


If you have a huge cluster, it is better to run the JobTracker and NameNode
on different machines.
If your cluster is small (fewer than roughly 20-30 machines), you can run the
JobTracker and NameNode on the same machine.
Again, it depends on the hardware configuration. NameNode and JobTracker
machines usually have a higher hardware specification than data nodes.


It depends on how big your cluster is and how much data you keep in HDFS.
NameNode memory usage is directly proportional to the size of HDFS and the
number of files and directories in it. The metadata and inode information for
each file and directory is held in the NameNode namespace, which lives in main
memory. Going by the bytes needed to store the metadata of one HDFS file, the
NameNode memory requirement can be summarized as
"10 million files require 4 GB of memory for NameNode".
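
To make that rule of thumb concrete, here is a minimal sketch that scales it
linearly to other file counts. It assumes the "10 million files, 4 GB" figure
holds proportionally and treats 1 GB as 2**30 bytes; the function names are
made up for illustration, and real usage will vary with block counts and
Hadoop version.

```python
# Rule of thumb quoted above: 10 million files need about 4 GB of NameNode memory.
# Assumption: the relationship is linear and 1 GB means 2**30 bytes.
RULE_FILES = 10_000_000
RULE_BYTES = 4 * 2**30


def bytes_per_file() -> float:
    """Approximate NameNode memory per file implied by the rule of thumb."""
    return RULE_BYTES / RULE_FILES


def estimate_namenode_memory_gb(num_files: int) -> float:
    """Scale the rule of thumb linearly to num_files; result is in GB (2**30 bytes)."""
    if num_files < 0:
        raise ValueError("num_files must be non-negative")
    return num_files * RULE_BYTES / RULE_FILES / 2**30


if __name__ == "__main__":
    print(f"~{bytes_per_file():.0f} bytes of NameNode memory per file")
    for n in (1_000_000, 10_000_000, 25_000_000):
        print(f"{n:>12,} files -> ~{estimate_namenode_memory_gb(n):.1f} GB")
```

By this estimate each file costs roughly 430 bytes of NameNode memory, so a
small cluster with a few million files leaves plenty of room on a shared box
for the JobTracker.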

For a small cluster you can have the NameNode and JobTracker running on the
same machine.

I'd start off with two DNS entries, "namenode" and "jobtracker", both
pointing to the same box. If you need to split the machines later, all
bookmarked URLs and configuration files will remain the same, which will
keep users happier.
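
A rough sketch of the idea, assuming the two aliases above exist in your DNS.
The property names fs.default.name and mapred.job.tracker are the usual Hadoop
settings of this era; the port numbers and both helper functions are
assumptions made up for illustration, not something from your setup.

```python
import socket


def hadoop_endpoints(nn_host="namenode", jt_host="jobtracker"):
    """Client settings written against the aliases, not the physical hostname.

    Ports 8020 and 8021 are assumed values; use whatever your cluster uses.
    """
    return {
        "fs.default.name": f"hdfs://{nn_host}:8020",
        "mapred.job.tracker": f"{jt_host}:8021",
    }


def same_box(name_a="namenode", name_b="jobtracker", resolve=socket.gethostbyname):
    """Return True if both aliases currently resolve to the same address."""
    return resolve(name_a) == resolve(name_b)


if __name__ == "__main__":
    # Stand-in resolver so the example runs without real DNS entries.
    fake_dns = {"namenode": "10.0.0.5", "jobtracker": "10.0.0.5"}
    print(hadoop_endpoints())
    print("colocated:", same_box(resolve=fake_dns.__getitem__))
    # Later split: only the DNS record changes, the settings stay identical.
    fake_dns["jobtracker"] = "10.0.0.6"
    print(hadoop_endpoints())
    print("colocated:", same_box(resolve=fake_dns.__getitem__))
```

The settings dictionary is the same before and after the split, which is the
point: only the DNS record for "jobtracker" has to move.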
