[
https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807719#comment-16807719
]
star commented on HDFS-14378:
-----------------------------
Initial patch for review. [~xkrogen], would you mind reviewing the
patch?
I don't have much time these days. There may be better solutions than this
patch, so I'd welcome any suggestions.
> Simplify the design of multiple NN and both logic of edit log roll and
> checkpoint
> ---------------------------------------------------------------------------------
>
> Key: HDFS-14378
> URL: https://issues.apache.org/jira/browse/HDFS-14378
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: star
> Assignee: star
> Priority: Minor
> Attachments: HDFS-14378-trunk.001.patch, HDFS-14378-trunk.002.patch
>
>
> HDFS-6440 introduced a mechanism to support more than 2 NNs. It
> implements a first-writer-wins policy to avoid duplicate fsimage downloads.
> The variable 'isPrimaryCheckPointer' holds the first-writer state; the SNN
> that has it set will provide the fsimage to the ANN next time. So there are
> three roles in the NN cluster: the ANN, one primary SNN, and one or more
> normal SNNs.
> Since HDFS-12248, there may be more than two primary SNNs shortly after
> an exception occurs. That change handles a scenario in which the SNN does not
> upload the fsimage on IOE and Interrupted exceptions. Though it does not cause
> any further functional issues, it is inconsistent.
> Furthermore, the edit log may be rolled more frequently than necessary with
> multiple standby NameNodes (HDFS-14349). (I'm not so sure about this; I will
> verify it with unit tests, or anyone could point it out.)
> Above all, I'm wondering if we could make it simpler with the following
> changes:
> * There are only two roles: ANN and SNN.
> * The ANN rolls its edit log every DFS_HA_LOGROLL_PERIOD_KEY period.
> * The ANN selects an SNN to download the checkpoint from.
> The SNN will just do log tailing and checkpointing, and provide a servlet
> for fsimage downloading as usual. The SNN will not try to roll the edit log
> or send checkpoint requests to the ANN.
> In short, the ANN will be more active. Suggestions are welcome.
>
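The proposal quoted above can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not code from the attached
patches: the class ActiveDrivenCheckpointSketch, the Standby type and both
methods are hypothetical names, and the "pick the SNN with the newest
checkpoint" policy is only one possible way for the ANN to select an SNN.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical sketch of an ANN-driven roll/checkpoint decision.
 * None of these types or methods exist in Hadoop; they only illustrate the idea.
 */
public class ActiveDrivenCheckpointSketch {

    /** What the ANN is assumed to know about one SNN. */
    public static final class Standby {
        public final String name;
        public final long lastCheckpointTxId;

        public Standby(String name, long lastCheckpointTxId) {
            this.name = name;
            this.lastCheckpointTxId = lastCheckpointTxId;
        }
    }

    /** The ANN rolls its own edit log once the configured period has elapsed. */
    public static boolean shouldRollEditLog(long nowMs, long lastRollMs, long rollPeriodMs) {
        return nowMs - lastRollMs >= rollPeriodMs;
    }

    /**
     * Assumed selection policy: pick the SNN with the newest checkpoint,
     * and only if it is newer than the image the ANN already has.
     */
    public static Optional<Standby> selectCheckpointSource(List<Standby> standbys,
                                                           long activeImageTxId) {
        return standbys.stream()
                .filter(s -> s.lastCheckpointTxId > activeImageTxId)
                .max(Comparator.comparingLong((Standby s) -> s.lastCheckpointTxId));
    }

    public static void main(String[] args) {
        List<Standby> standbys = Arrays.asList(
                new Standby("snn1", 100L),
                new Standby("snn2", 250L));

        long rollPeriodMs = 120_000L; // illustrative value for the roll period
        if (shouldRollEditLog(130_000L, 0L, rollPeriodMs)) {
            System.out.println("ANN rolls its edit log");
        }

        // The SNNs stay passive: they only tail edits, checkpoint, and serve the image.
        selectCheckpointSource(standbys, 150L)
                .ifPresent(s -> System.out.println("ANN downloads fsimage from " + s.name));
    }
}
```

In this sketch the SNNs never call the ANN; all roll and download decisions
are made on the ANN side, which is the point of the two-role design.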
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)