[ https://issues.apache.org/jira/browse/HDFS-12979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834049#comment-16834049 ]
Erik Krogen commented on HDFS-12979:
------------------------------------
Good catch, [~vagarychen]. I wonder if we can solve it more simply by saying
that, instead of having one system-wide primary checkpointer, each _receiving_
node has a corresponding primary checkpointer. On the SbNN side, maintain a map
of {{nnAddress -> isPrimaryCheckPointer}}. Each SbNN can be the primary for 0-n
NNs, and each active/observer NN should have exactly 1 primary.
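
A rough sketch of what that per-receiver map on the SbNN could look like. The class and method names are hypothetical and not taken from the patch, and it assumes the flag is refreshed from the outcome of the most recent image upload to each NN, which is not spelled out above:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper, not an actual HDFS class: tracks, on one SbNN,
 * whether it is the primary checkpointer for each receiving NN.
 */
class PrimaryCheckpointerTracker {
  // Key: address of a receiving NN (active or observer).
  // Value: true if this SbNN currently considers itself that NN's primary.
  private final Map<String, Boolean> primaryByReceiver = new ConcurrentHashMap<>();

  /**
   * Assumption: the flag follows the result of the latest upload attempt,
   * i.e. an accepted image makes this SbNN the primary for that receiver.
   */
  void recordUploadResult(String nnAddress, boolean accepted) {
    primaryByReceiver.put(nnAddress, accepted);
  }

  /** Receivers that have never been contacted default to "not primary". */
  boolean isPrimaryFor(String nnAddress) {
    return primaryByReceiver.getOrDefault(nnAddress, false);
  }

  /** Number of receiving NNs this SbNN is primary for (the "0-n" above). */
  long primaryCount() {
    return primaryByReceiver.values().stream().filter(Boolean::booleanValue).count();
  }
}
```

This only covers the SbNN-side bookkeeping. Making sure each active/observer NN ends up with exactly one primary would still need to be enforced on the receiving side.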
> StandbyNode should upload FsImage to ObserverNode after checkpointing.
> ----------------------------------------------------------------------
>
> Key: HDFS-12979
> URL: https://issues.apache.org/jira/browse/HDFS-12979
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Konstantin Shvachko
> Assignee: Chen Liang
> Priority: Major
> Attachments: HDFS-12979.001.patch
>
>
> ObserverNode does not create checkpoints, so its fsimage file can get very
> old, making bootstrap of the ObserverNode take too long. A StandbyNode should
> copy the latest fsimage to the ObserverNode(s) as well as to the ANN.