[ 
https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853485#action_12853485
 ] 

Sanjay Radia commented on HDFS-1073:
------------------------------------

I forgot to mention other advantages.
* The image does not need to be sent back to the primary NN via a special 
mechanism. One can simply copy it back using any tool. 
* If, for HA, one wants to keep the images and logs on shared storage (harder 
for logs), the checkpointer can simply copy the checkpointed image to the 
shared storage without involving the primary. 
* Rob Chancellor pointed out another advantage: a server with large memory can 
simply run a checkpoint command as a cron job for *ALL* NNs in the data center 
(especially useful under federated NNs). The disadvantage is that it would 
require a separate server.
* Further, I believe it simplifies the backup NN code:
** Currently, when the backup starts a checkpoint, it has to lock the fs state, 
store the new logs sent by the primary in a special place, and then do 
something special to sync back in. I think this resync would not be necessary 
if we used a special server to run periodic checkpoints.
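
To make the idea concrete, here is a rough sketch of what such a standalone 
checkpoint job might look like. It is only an illustration under loud 
assumptions: the JSON file names (fsimage.json, edits.json), the "create" and 
"delete" operations, and the functions checkpoint() and checkpoint_all() are 
all hypothetical and do not reflect the real fsImage or edit log formats or 
any existing HDFS tool.

```python
import json
import shutil
from pathlib import Path


def checkpoint(nn_dir: Path, shared_dir: Path) -> Path:
    """Merge one NN's image and edit log offline, then copy the result out.

    Hypothetical layout: nn_dir holds fsimage.json (path -> metadata) and
    edits.json (ordered list of operations). Neither is a real HDFS format.
    """
    image = json.loads((nn_dir / "fsimage.json").read_text())
    edits = json.loads((nn_dir / "edits.json").read_text())

    # Replay the edit log on top of the last image; the primary is not involved.
    for op in edits:
        if op["op"] == "create":
            image[op["path"]] = op["meta"]
        elif op["op"] == "delete":
            image.pop(op["path"], None)

    new_image = nn_dir / "fsimage.ckpt.json"
    new_image.write_text(json.dumps(image, sort_keys=True))

    # The checkpointed image is an ordinary file, so any copy tool will do.
    shared_dir.mkdir(parents=True, exist_ok=True)
    dest = shared_dir / f"{nn_dir.name}-fsimage.json"
    shutil.copy2(new_image, dest)
    return dest


def checkpoint_all(nn_dirs, shared_dir):
    """Run the checkpoint for every NN directory, e.g. from a single cron job."""
    return [checkpoint(Path(d), Path(shared_dir)) for d in nn_dirs]
```

A cron entry on the large-memory server could then call checkpoint_all() with 
one directory per NN. Nothing in the sketch locks state on the primary or 
resyncs with it.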

> Simpler model for Namenode's fs Image and edit Logs 
> ----------------------------------------------------
>
>                 Key: HDFS-1073
>                 URL: https://issues.apache.org/jira/browse/HDFS-1073
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sanjay Radia
>            Assignee: Todd Lipcon
>
> The naming and handling of the NN's fsImage and edit logs can be significantly 
> improved, resulting in simpler and more robust code.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
