[ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489627#comment-13489627 ]
Todd Lipcon commented on HDFS-4114:
-----------------------------------
{quote}
I see BackupNode as a better way of creating checkpoints. SNN downloads the
image and the edits from NN, merges them in memory and then sends back the new
checkpoint.
BN only needs to saveNamespace() from memory and then send back the new image.
This reduces the network traffic and local disk IO from transferring two large
files. I have seen the NameNode run much slower on multiple large clusters
while a checkpoint is in progress.
It is beneficial for HDFS performance to switch from SNN to BN for
checkpointing. Therefore I would advocate re-re-deprecating SNN instead of
removing BN.
{quote}
This argument seems to be predicated on the idea that the SecondaryNameNode
(2NN) doesn't keep the image in memory between checkpoints, and that it
downloads the image from the NN anew for each checkpoint. This hasn't been the
case since HDFS-1458 in 0.23, which made a small improvement to the 2NN to
solve the problem you're pointing out.
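To make the point concrete, here is a toy model of the download side of a
checkpoint. It is only a sketch under stated assumptions, not the actual 2NN
code: the class names, the image_id field and the byte counts are all
hypothetical, and it ignores the upload of the merged image, which happens
either way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNameNode:
    image_id: int      # identifies the NN's current on-disk image
    image_bytes: int   # size of that image
    edits_bytes: int   # size of the edits accumulated since that image


@dataclass
class ToySecondary:
    cached_image_id: Optional[int] = None  # image currently held in memory

    def checkpoint(self, nn: ToyNameNode) -> int:
        """Run one checkpoint; return the bytes downloaded from the NN."""
        downloaded = nn.edits_bytes  # assumption: edits are fetched every time
        if self.cached_image_id != nn.image_id:
            # Cold start, or the NN's image changed without us: fetch it.
            downloaded += nn.image_bytes
        # Merge in memory and hand the new image back; the NN adopts it,
        # so both sides now agree on the image.
        nn.image_id += 1
        self.cached_image_id = nn.image_id
        return downloaded


nn = ToyNameNode(image_id=7, image_bytes=1000, edits_bytes=10)
snn = ToySecondary()
first = snn.checkpoint(nn)    # 1010: image + edits
second = snn.checkpoint(nn)   # 10: edits only
```

In this model only the first checkpoint pays for the image download. Later ones
fetch just the edits, unless the NN's image changes behind the 2NN's back.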
{quote}
I would be glad to go into design discussion and potential enhancements of
BackupNode with you. Would appreciate it given your experience with HA, as I
believe the HA story for Hadoop isn't over with the implementation of Quorum
Journal.
{quote}
Feel free to ping me if you have any questions on the HA design or
implementation - always happy to help.
{quote}
Although this issue is not about that. Sticking to the point, what are your
arguments for removing (or, better said, deprecating) BN besides that it has
bugs? Software tends to have bugs. E.g. you do not propose to remove
BlockScanner just because it couldn't be fixed over a series of JIRAs.
{quote}
The BackupNode doesn't provide any feature that is not provided better by other
pieces of code. Your argument about efficiency isn't valid given HDFS-1458.
The BlockScanner argument is a silly one: it has had some bugs, but there is no
alternative available which _doesn't_ have bugs, so a buggy piece of code is
better than no piece of code. If someone had written a new BlockScanner which
offered more features and fewer bugs, I'd absolutely advocate removing the old
one.
> Remove the CheckpointNode
> -------------------------
>
> Key: HDFS-4114
> URL: https://issues.apache.org/jira/browse/HDFS-4114
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Eli Collins
> Assignee: Eli Collins
>
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the
> BackupNode and CheckpointNode.