[Hadoop Wiki] Update of "NameNode" by SteveLoughran

Apache Wiki Tue, 05 Aug 2008 02:06:48 -0700

The following page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/NameNode

The comment on the change is:
creating a page

New page:
The NameNode is the centerpiece of an HDFS filesystem. It keeps the directory 
tree of all files in the filesystem, and tracks where across the cluster the 
files are kept. It does not store any of these files itself.

Client applications talk to the NameNode whenever they wish to locate a file, 
or when they want to add/copy/move/delete a file. The NameNode responds to 
successful requests by returning a list of relevant DataNode servers where the 
data lives.
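
As a rough illustration of that division of labour, here is a toy in-memory 
model in Python. It is only a sketch: the class ToyNameNode and its methods 
are hypothetical names invented for this example, not part of any Hadoop API, 
and the real NameNode tracks individual blocks rather than whole files. The 
point it shows is that the NameNode holds only metadata (path to DataNode 
list) and never the file contents.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyNameNode:
    """Hypothetical toy model: maps a file path to the DataNodes holding it."""

    # Metadata only: path -> list of DataNode host names. No file data here.
    locations: Dict[str, List[str]] = field(default_factory=dict)

    def add_file(self, path: str, datanodes: List[str]) -> List[str]:
        """Record a new file and return the DataNodes the client should use."""
        self.locations[path] = list(datanodes)
        return list(self.locations[path])

    def locate(self, path: str) -> List[str]:
        """Return the DataNodes where the file's data lives."""
        if path not in self.locations:
            raise FileNotFoundError(path)
        return list(self.locations[path])

    def move(self, src: str, dst: str) -> List[str]:
        """Rename a file; only the directory entry changes, not the data."""
        if src not in self.locations:
            raise FileNotFoundError(src)
        self.locations[dst] = self.locations.pop(src)
        return list(self.locations[dst])

    def delete(self, path: str) -> List[str]:
        """Forget a file and return the DataNodes that held its data."""
        if path not in self.locations:
            raise FileNotFoundError(path)
        return self.locations.pop(path)
```

In this model a failed request raises an error, while a successful one returns 
the DataNode list, after which the client would contact those DataNodes 
directly for the data.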

The NameNode is a Single Point of Failure for the HDFS Cluster. There is 
support for NameNodeFailover, with a SecondaryNameNode hosted on a separate 
machine being able to stand in for the original NameNode if it goes down. 
However, HDFS is not currently a HighAvailability filesystem. When the NameNode 
goes down, the filesystem goes offline.

It is essential to look after the NameNode. Here are some recommendations from 
production use:

 * Use a good server with lots (15GB+) of RAM.
 * Use fast RAID5 storage for keeping the index.
 * Configure the name node to store one set of transaction logs on a separate 
disk from the index. 
 * Configure the name node to store another set of transaction logs to a 
network mounted disk.  
 * Monitor the disk space available to the NameNode. If it is getting low, add 
more storage.
 * Do not host DataNode, JobTracker or TaskTracker services on the same system.
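
The disk space recommendation above could be automated with a small check 
along these lines. This is a minimal sketch, not a Hadoop tool: the function 
name low_space_dirs and the 20% default threshold are assumptions made for 
illustration, and the directories passed in would be whatever storage 
locations your NameNode is configured to use.

```python
import shutil
from typing import Iterable, List, Tuple


def low_space_dirs(
    dirs: Iterable[str], min_free_fraction: float = 0.2
) -> List[Tuple[str, float]]:
    """Return (directory, free_fraction) for each directory whose filesystem
    has less than min_free_fraction of its capacity free.

    The threshold is an assumed default; pick one that suits your cluster.
    """
    low = []
    for d in dirs:
        usage = shutil.disk_usage(d)  # named tuple: total, used, free (bytes)
        if usage.total == 0:
            # Skip pseudo-filesystems that report no capacity.
            continue
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            low.append((d, free_fraction))
    return low
```

For example, low_space_dirs(["/data/1/namenode", "/mnt/nfs/namenode"]) would 
list whichever of those (hypothetical) paths is running short, and the result 
could be fed into whatever alerting you already use.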

If a NameNode does not start up, look at the TroubleShooting page.
