[
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542834#comment-14542834
]
Aaron T. Myers commented on HDFS-6440:
--------------------------------------
bq. By setting the seed, you get the same sequence of NN failures. So one seed
would do 1->2->1->3, while another might do 1->3->2->1. Then, with the seed you
could reproduce the series of failovers in the same order, which seems like a
laudable goal for the test, especially when trying to debug weird error cases.
Unless I'm missing something?
Right, I get the intended purpose, but one of us must be missing something
because I still think there's some funny stuff going on with the
{{FAILOVER_SEED}} variable. :)
In the latest patch, you'll see that the variable {{FAILOVER_SEED}} is used in
the following steps:
# Statically declare {{FAILOVER_SEED}} and initialize it to the value of
{{System.currentTimeMillis()}}
# Statically create {{failoverRandom}} to be a new {{Random}} object,
initialized with the value of {{FAILOVER_SEED}}.
# In a static block, log the value of {{FAILOVER_SEED}}.
# In {{doWriteOverFailoverTest}}, reset the value of {{FAILOVER_SEED}} to again
be {{System.currentTimeMillis()}}.
# Immediately thereafter in {{doWriteOverFailoverTest}}, log the new value of
{{FAILOVER_SEED}}.
Note that there is no step 6 that resets {{failoverRandom}} to use the new
value of {{FAILOVER_SEED}} that was set in step 4, nor is {{FAILOVER_SEED}}
used for anything else after step 5. Thus, unless I'm missing something, it seems
like steps 4 and 5 are at best superfluous, and at worst misleading, since the
test logs will contain a message about using a random seed that is in fact
never used.
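To make the concern concrete, here is a minimal standalone sketch of the pattern
described in the steps above. It is not the code from the patch: the class name,
the helper methods ({{resetSeedOnly}}, {{resetSeedAndRandom}}, {{nextFailovers}},
{{expectedFor}}), and the log message text are made up for illustration. Only
{{FAILOVER_SEED}} and {{failoverRandom}} are names taken from the test.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch only; every helper method here is hypothetical.
class FailoverSeedSketch {
    // Steps 1 and 2: seed declared statically, Random built from it once.
    static long FAILOVER_SEED = System.currentTimeMillis();
    static Random failoverRandom = new Random(FAILOVER_SEED);

    // Step 3: log the seed in a static block.
    static {
        System.out.println("Using random seed: " + FAILOVER_SEED);
    }

    // Steps 4 and 5: the seed field changes and is logged,
    // but failoverRandom still runs off the seed it was constructed with.
    static void resetSeedOnly(long newSeed) {
        FAILOVER_SEED = newSeed;
        System.out.println("Using random seed: " + FAILOVER_SEED);
    }

    // The missing "step 6": rebuild the generator from the seed being logged.
    static void resetSeedAndRandom(long newSeed) {
        FAILOVER_SEED = newSeed;
        failoverRandom = new Random(FAILOVER_SEED);
        System.out.println("Using random seed: " + FAILOVER_SEED);
    }

    // Draws a sequence of NN indexes to fail over to, using the shared generator.
    static int[] nextFailovers(int count, int numNameNodes) {
        int[] order = new int[count];
        for (int i = 0; i < count; i++) {
            order[i] = failoverRandom.nextInt(numNameNodes);
        }
        return order;
    }

    // The sequence someone would get when replaying a given logged seed.
    static int[] expectedFor(long seed, int count, int numNameNodes) {
        Random replay = new Random(seed);
        int[] order = new int[count];
        for (int i = 0; i < count; i++) {
            order[i] = replay.nextInt(numNameNodes);
        }
        return order;
    }

    public static void main(String[] args) {
        resetSeedAndRandom(42L);   // generator really is seeded with 42
        resetSeedOnly(7L);         // log now claims 7, generator still on 42
        int[] stale = nextFailovers(5, 3);
        System.out.println("follows old seed 42: "
            + Arrays.equals(stale, expectedFor(42L, 5, 3)));

        resetSeedAndRandom(7L);    // with the extra step, log and generator agree
        int[] reseeded = nextFailovers(5, 3);
        System.out.println("follows logged seed 7: "
            + Arrays.equals(reseeded, expectedFor(7L, 5, 3)));
    }
}
```

Both comparisons print {{true}}: after {{resetSeedOnly}}, the failover order still
comes from the original seed, so replaying the most recently logged seed would not
reproduce the run. Either dropping steps 4 and 5 or re-creating
{{failoverRandom}} alongside the new seed would make the log trustworthy again.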
> Support more than 2 NameNodes
> -----------------------------
>
> Key: HDFS-6440
> URL: https://issues.apache.org/jira/browse/HDFS-6440
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: auto-failover, ha, namenode
> Affects Versions: 2.4.0
> Reporter: Jesse Yates
> Assignee: Jesse Yates
> Fix For: 3.0.0
>
> Attachments: Multiple-Standby-NameNodes_V1.pdf,
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch,
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch,
> hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one
> active, one standby). This would be the last bit to support running multiple
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some
> complexity around managing the checkpointing, and updating a whole lot of
> tests.