[ http://issues.apache.org/jira/browse/HADOOP-227?page=comments#action_12457507 ]

dhruba borthakur commented on HADOOP-227:
-----------------------------------------

The Backup Namenode Proposal
--------------------------------------------

The idea is to create a backup namenode, download the fsimage and the edits file to the backup namenode, merge them into a single image, and then upload the newly created image to the primary namenode.

This approach has the following advantages:
1. No additional memory or CPU requirement for the primary namenode.
2. Good scalability: backup namenodes can be plugged into the network on demand.
3. Address space separation of the primary namenode and the backup namenode, and thus better fault tolerance.

The namenode, when invoked with the "-backupmode" command line option, functions as the backup namenode. No additional scripts are needed. One can run the backup namenode and the primary namenode on the same physical machine.

The backup namenode downloads the fsimage and the edits from the primary namenode through an HTTP GET request. The primary namenode rolls the edits file on disk and starts logging new transactions into the new editlog file. The backup namenode merges the downloaded fsimage and edits into a new image file. It then uploads the new image file to the primary namenode. The primary namenode replaces the old fsimage and the old editlog with the newly uploaded fsimage.

> Namespace check pointing is not performed until the namenode restarts.
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-227
>                 URL: http://issues.apache.org/jira/browse/HADOOP-227
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.2.0
>            Reporter: Konstantin Shvachko
>         Assigned To: Milind Bhandarkar
>         Attachments: patch-async-checkpoints-0.9.0, patch-async-checkpoints-0.9.0, patch-async-checkpoints-0.9.0
>
>
> In current implementation when the name node starts, it reads its image file, then
> the edits file, and then saves the updated image back into the image file.
> The image file is never updated after that.
> In order to provide the system reliability the namespace information should
> be check pointed periodically, and the edits file should be kept relatively small.
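
The checkpoint sequence in the proposal above (roll the edits file, merge on the backup, upload, replace) can be sketched roughly as follows. This is only an in-memory illustration, not the Hadoop implementation: the names Primary, roll_edits, install_image, merge and checkpoint, and the "create"/"delete" operations, are all hypothetical, and a dict and lists stand in for the on-disk fsimage, the editlog files and the HTTP transfer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Primary:
    """Hypothetical stand-in for the primary namenode's on-disk state."""
    fsimage: dict = field(default_factory=dict)   # last checkpointed namespace
    edits: list = field(default_factory=list)     # transactions since that checkpoint
    edits_new: Optional[list] = None              # log opened while a checkpoint runs

    def log(self, op, path, value=None):
        # New transactions go to the rolled log if a checkpoint is in progress.
        target = self.edits_new if self.edits_new is not None else self.edits
        target.append((op, path, value))

    def roll_edits(self):
        # Start a new edit log, then hand out copies of the old image and edits
        # (this models what the backup would download).
        if self.edits_new is None:
            self.edits_new = []
        return dict(self.fsimage), list(self.edits)

    def install_image(self, new_image):
        # Replace the old fsimage and old edits with the uploaded image;
        # the rolled log becomes the current edit log.
        self.fsimage = new_image
        self.edits = self.edits_new or []
        self.edits_new = None


def merge(fsimage, edits):
    """Runs on the backup: replay the edits over a copy of the image."""
    image = dict(fsimage)
    for op, path, value in edits:
        if op == "create":
            image[path] = value
        elif op == "delete":
            image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return image


def checkpoint(primary):
    image, edits = primary.roll_edits()   # download
    new_image = merge(image, edits)       # merge on the backup
    primary.install_image(new_image)      # upload and replace
```

In this sketch, transactions logged between roll_edits and install_image are not part of the uploaded image; they remain in the edit log and are picked up by the next checkpoint, which is what keeps the edits file small.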