Hello Roman,

If you have a large cluster, it is better to run the JobTracker and the
NameNode on different machines.
If your cluster is small (fewer than roughly 20-30 machines), you can run the
JobTracker and the NameNode on the same machine.
It also depends on the hardware configuration. Usually the NameNode and
JobTracker machines have a higher hardware configuration than the data nodes.


It depends on how big your cluster is and how much data you have in HDFS.
NameNode memory usage grows in direct proportion to the size of HDFS and the
number of files and directories on it. The metadata and inode information for
each file and directory is stored in the NameNode namespace, which is held in
main memory. Going by the number of bytes used to store the metadata of each
HDFS file in the namespace, the NameNode memory requirement can be summarized
as "10 million files require 4 GB of memory for NameNode".
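
As a rough illustration of that rule of thumb, here is a minimal Python sketch
that scales "10 million files require 4 GB" linearly to other file counts. The
function name and the strictly linear scaling are my own assumptions for the
example, not something Hadoop provides; treat the result as a ballpark figure
only.

```python
# Rule of thumb from this thread: 10 million files need about 4 GB of
# NameNode memory. Linear scaling is an assumption made for illustration.
RULE_FILES = 10_000_000
RULE_MEMORY_GB = 4

def estimate_namenode_memory_gb(num_files):
    """Hypothetical helper: estimate NameNode memory in GB for num_files."""
    return num_files * RULE_MEMORY_GB / RULE_FILES

if __name__ == "__main__":
    for files in (1_000_000, 10_000_000, 25_000_000):
        gb = estimate_namenode_memory_gb(files)
        print(f"{files} files: about {gb:.1f} GB")
```

By this estimate, the rule works out to roughly 430 bytes of NameNode memory
per file.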

For a small cluster, you can run the NameNode and the JobTracker on the same
machine.

Regards,
-
Ravi

On 7/20/09 6:25 PM, "roman kolcun" <[email protected]> wrote:

Hello everyone,
is there any performance difference (or any advantage / disadvantage) in
colocating NameNode and JobTracker on the same node? Is it better to put
them on different nodes or on the same one?

Thank you for your answers.

Yours Sincerely,
Roman
