[ https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542834#comment-14542834 ]

Aaron T. Myers commented on HDFS-6440:
--------------------------------------

bq. By setting the seed, you get the same sequence of NN failures. So one seed 
would do 1->2->1->3, while another might do 1->3->2->1. Then, with the seed, you 
could reproduce the series of failovers in the same order, which seems like a 
laudable goal for the test, especially when trying to debug weird error cases. 
Unless I'm missing something?

Right, I get the intended purpose, but one of us must be missing something 
because I still think there's some funny stuff going on with the 
{{FAILOVER_SEED}} variable. :)

In the latest patch, you'll see that the variable {{FAILOVER_SEED}} is used in 
the following steps:

# Statically declare {{FAILOVER_SEED}} and initialize it to the value of 
{{System.currentTimeMillis()}}
# Statically create {{failoverRandom}} to be a new {{Random}} object, 
initialized with the value of {{FAILOVER_SEED}}.
# In a static block, log the value of {{FAILOVER_SEED}}.
# In {{doWriteOverFailoverTest}}, reset the value of {{FAILOVER_SEED}} to again 
be {{System.currentTimeMillis()}}.
# Immediately thereafter in {{doWriteOverFailoverTest}}, log the new value of 
{{FAILOVER_SEED}}.

Note that there is no step 6 that resets {{failoverRandom}} to use the new 
value of {{FAILOVER_SEED}} set in step 4, nor is {{FAILOVER_SEED}} used for 
anything else after step 5. Thus, unless I'm missing something, steps 4 and 5 
are at best superfluous and at worst misleading, since the test logs will 
contain a message about using a random seed that is in fact never used.
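
To make the concern concrete, here is a minimal standalone sketch of the pattern described in the steps above. It is not the code from the patch: the class {{FailoverSeedSketch}}, its method names, the fixed seed values, and the choice of 3 NameNodes and 8 failovers are all hypothetical, chosen only for illustration. It shows that reassigning the seed variable after the {{Random}} has been constructed does not change the sequence, and one possible way to keep the logged seed and the generator in agreement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FailoverSeedSketch {

    /** Draws `count` NameNode indices in [0, numNameNodes) from the given generator. */
    static List<Integer> failoverOrder(Random random, int numNameNodes, int count) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            order.add(random.nextInt(numNameNodes));
        }
        return order;
    }

    /** Mirrors steps 1-5: the generator is built from the first seed and never rebuilt. */
    static List<Integer> orderAsPatched(long initialSeed, long reassignedSeed) {
        long failoverSeed = initialSeed;                   // step 1
        Random failoverRandom = new Random(failoverSeed);  // step 2
        failoverSeed = reassignedSeed;                     // step 4: failoverRandom is unaffected
        return failoverOrder(failoverRandom, 3, 8);
    }

    /** One possible fix (an assumption, not the patch): seed the generator with the logged value. */
    static List<Integer> orderFromLoggedSeed(long loggedSeed) {
        return failoverOrder(new Random(loggedSeed), 3, 8);
    }

    public static void main(String[] args) {
        long first = 42L;
        long second = 1234L;
        // The sequence is fully determined by the first seed...
        System.out.println(orderAsPatched(first, second).equals(orderFromLoggedSeed(first)));
        // ...and does not depend on whatever the seed variable is reassigned to later.
        System.out.println(orderAsPatched(first, 99999L).equals(orderAsPatched(first, second)));
    }
}
```

Both lines print {{true}}: the failover order depends only on the seed passed to the {{Random}} constructor, so a seed logged after step 4 cannot be used to reproduce the run. Either dropping steps 4 and 5, or re-creating {{failoverRandom}} from the new seed before logging it, would make the logged value meaningful.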

> Support more than 2 NameNodes
> -----------------------------
>
>                 Key: HDFS-6440
>                 URL: https://issues.apache.org/jira/browse/HDFS-6440
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: auto-failover, ha, namenode
>    Affects Versions: 2.4.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>             Fix For: 3.0.0
>
>         Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, 
> hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
