[ 
https://issues.apache.org/jira/browse/HDFS-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054744#comment-13054744
 ] 

Robert Chansler commented on HDFS-2064:
---------------------------------------

When I read K's paper, I found that it did generally fit the model discussed 
among ourselves and in HDFS-1623. I really would consider it a specialization 
of the model. I've re-read both K's and Sanjay+Suresh's papers fairly carefully 
to get a better sense of the differences.

1. In K7.8 there is a discussion of whether the NN should more proactively look 
for missing replicas. I don't remember discussing this, but my first thought is 
that this is an instance of trying too hard at the margin. During what fraction 
of the system's lifetime would this help?

2. K7.6 mentions turning off lease recovery, but also replication checking as 
in SS9.3.1.

3. What is the scope of VIP solutions? This is the "single switch" question. A 
while ago, we got into trouble when VIP did not just work with HDFS. More 
recently we got into trouble when DNS resolution was cached, but when I asked 
Rajiv why VIP wasn't the answer, he said that they could not (in general) 
provide an alternate host with the NN's VIP. Is everybody confident that 
single-switch VIP works well enough for HA? (When I ask that question of Rajiv, 
he says yes.)

4. We've been very anxious about the stale deletion request problem where a DN 
has a request from the old NN that has not been reported to the NN now in 
service. This is hinted at in SS9.1.2, but I don't think this is fully 
understood yet. SS goes further into the topic of "data node fencing." Sanjay 
and I disagree on the merits. I'd argue that DNs should just do as they are 
told, and not try to mediate sibling disputes among NNs.

5. NN arbitration really is important. K hints somewhere (can't find it now) 
that the old NN must be stopped. SS are more emphatic. I'd say do STONITH. This 
becomes more important anywhere near the word "automatic."

6. SS9.2 mentions "leader election". Is the world really symmetric? XXX^1^ 
denied that symmetry was a good thing. Any specific proposal needs to address 
the question of how alike the first and second systems are, and whether the 
process runs backward.

7. LoadReplicator in K6.2 is a new contribution to the discussion. This bears 
on issue #4, above.

8. Where K really diverges from most discussion here is over the question of 
Backup NameNode (BN) versus spooling edits on secondary storage.  I mostly 
understand the issues, but in a practical BN deployment, is there a remaining 
need for some shared storage?
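
To make the stale deletion concern in item 4 concrete, here is a minimal 
sketch of the two positions. It is an illustration only, not taken from either 
paper or from HDFS code: the epoch number, the Command and DataNode classes, 
and the "fencing" flag are all hypothetical names. With fencing off, the DN 
does as it is told; with fencing on, it drops commands from an NN older than 
the newest one it has heard from.

```python
from dataclasses import dataclass


@dataclass
class Command:
    nn_epoch: int   # hypothetical counter, bumped on every failover
    action: str     # "delete" or "noop" in this sketch
    block_id: int


@dataclass
class DataNode:
    blocks: set
    highest_epoch: int = 0
    fencing: bool = True   # False models "DNs just do as they are told"

    def handle(self, cmd: Command) -> bool:
        """Apply cmd; return False if it was rejected as stale."""
        if self.fencing and cmd.nn_epoch < self.highest_epoch:
            return False
        self.highest_epoch = max(self.highest_epoch, cmd.nn_epoch)
        if cmd.action == "delete":
            self.blocks.discard(cmd.block_id)
        return True


# A delete issued by the old NN (epoch 1) arrives after the DN has
# already heard from the new NN (epoch 2).
fenced = DataNode(blocks={1, 2})
fenced.handle(Command(2, "noop", 0))
fenced.handle(Command(1, "delete", 1))      # rejected, block 1 survives

obedient = DataNode(blocks={1, 2}, fencing=False)
obedient.handle(Command(2, "noop", 0))
obedient.handle(Command(1, "delete", 1))    # applied, new NN never hears of it
```

Neither variant settles the disagreement. The sketch only shows where the 
decision would sit if it were pushed onto the DNs.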

Why, yes, you could do differently, but a practical solution has:
* VIP
* LinuxHA
* No Zookeeper
* STONITH
* Only transfer one way without administrator intervention
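
How the last two bullets might interact can be sketched as follows. This is a 
rough illustration under stated assumptions, not a proposal from either paper: 
FailoverController, the stonith and take_vip callables, and admin_reset are 
hypothetical names, and a real deployment would delegate these steps to the HA 
software.

```python
class FailoverController:
    """One-way failover: fence the old NN first, then move the VIP."""

    def __init__(self, stonith, take_vip):
        self.stonith = stonith      # callable; True once old NN is confirmed off
        self.take_vip = take_vip    # callable; moves the VIP to the standby
        self.failed_over = False

    def on_primary_failure(self) -> bool:
        if self.failed_over:
            return False            # no automatic transfer back
        if not self.stonith():
            return False            # old NN may still be alive; do nothing
        self.take_vip()
        self.failed_over = True
        return True

    def admin_reset(self) -> None:
        # Only an administrator re-arms the controller.
        self.failed_over = False


events = []
ctl = FailoverController(
    stonith=lambda: events.append("stonith") or True,
    take_vip=lambda: events.append("vip"),
)
first = ctl.on_primary_failure()    # fences, then takes the VIP
second = ctl.on_primary_failure()   # refused until admin_reset()
```

The ordering is the point: the VIP moves only after the old NN is confirmed 
stopped, and a second transfer needs a human.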


The open argument in my mind is BN versus spool-to-disk. Oh, and if the LR
really means DNs need not know that there are multiple servers, life is
delightfully simpler.

^1^ My memory is uncertain about the proper attribution here. 


> Warm HA NameNode going Hot
> --------------------------
>
>                 Key: HDFS-2064
>                 URL: https://issues.apache.org/jira/browse/HDFS-2064
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: WarmHA-GoingHot.pdf
>
>
> This is the design for automatic hot HA for HDFS NameNode. It involves use of 
> HA software and LoadReplicator - external to Hadoop components, which 
> substantially simplify the architecture by separating HA- from 
> Hadoop-specific problems. Without the external components it provides warm 
> standby with manual failover.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
