[
https://issues.apache.org/jira/browse/HDFS-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054744#comment-13054744
]
Robert Chansler commented on HDFS-2064:
---------------------------------------
When I read K's paper, I found that it did generally fit the model discussed
among ourselves and in HDFS-1623. I really would consider it a specialization
of the model. I've re-read both K's and Sanjay+Suresh's papers fairly carefully
to get a better sense of the differences.
1. In K7.8 there is a discussion of whether the NN should more proactively look
for missing replicas. I don't remember discussing this, but my first thought is
that this is an instance of trying too hard at the margin. During what fraction
of the system's lifetime would this help?
2. K7.6 mentions turning off not only lease recovery but also replication
checking, as in SS9.3.1.
3. What is the scope of VIP solutions? This is the "single switch" question. A
while ago, we got into trouble when VIP did not just work with HDFS. More
recently we got into trouble when DNS resolution was cached, but when I asked
Rajiv why VIP wasn't the answer, he said that they could not (in general)
provide an alternate host with the NN's VIP. Is everybody confident that
single-switch VIP works well enough for HA? (When I ask that question of Rajiv,
he says yes.)
4. We've been very anxious about the stale deletion request problem where a DN
has a request from the old NN that has not been reported to the NN now in
service. This is hinted at in SS9.1.2, but I don't think this is fully
understood yet. SS goes further into the topic of "data node fencing." Sanjay
and I disagree on the merits. I'd argue that DNs should just do as they are
told, and not try to mediate sibling disputes among NNs.
5. NN arbitration really is important. K hints somewhere (can't find it now)
that the old NN must be stopped. SS are more emphatic. I'd say do STONITH. This
becomes more important anywhere near the word "automatic."
6. SS9.2 mentions "leader election". Is the world really symmetric? XXX^1^
denied that symmetry was a good thing. Any specific proposal needs to address
the question of how alike the first and second systems are, and whether the
process runs backward.
7. Load Replicator in K6.2 is a new contribution to the discussion. This bears
on issue #4, above.
8. Where K really diverges from most discussion here is over the question of
Backup name node versus spooling edits on secondary storage. I mostly
understand the issues, but in a practical BN deployment, is there a remaining
need for some shared storage?
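To make the stale deletion concern in item 4 concrete, here is a minimal Java
sketch of one possible reading of "data node fencing". It assumes each NN stamps
its requests with a monotonically increasing epoch number; that epoch, and every
class and method name below, is a hypothetical illustration and is not taken
from K's or SS's papers or from HDFS code. In this sketch the DN drops queued
deletions issued under an older epoch and refuses late requests from a
superseded NN.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch only: these classes and the "epoch" field are
// illustrative assumptions, not HDFS APIs and not taken from either paper.
final class StaleDeletionSketch {

    /** A block deletion request stamped with the epoch of the issuing NN. */
    static final class DeleteRequest {
        final long epoch;
        final long blockId;

        DeleteRequest(long epoch, long blockId) {
            this.epoch = epoch;
            this.blockId = blockId;
        }
    }

    /** DN stand-in that queues deletions and fences requests from older epochs. */
    static final class DataNode {
        private long highestEpochSeen = 0;
        private final Deque<DeleteRequest> pending = new ArrayDeque<>();

        /** Queues the request; returns false if it comes from a superseded NN. */
        boolean receive(DeleteRequest req) {
            if (req.epoch < highestEpochSeen) {
                return false; // late request from the old NN
            }
            if (req.epoch > highestEpochSeen) {
                // A newer NN is in service: drop queued work it never heard about.
                highestEpochSeen = req.epoch;
                pending.clear();
            }
            pending.add(req);
            return true;
        }

        /** Executes queued deletions and returns the block ids removed. */
        List<Long> executePending() {
            List<Long> deleted = new ArrayList<>();
            while (!pending.isEmpty()) {
                deleted.add(pending.poll().blockId);
            }
            return deleted;
        }
    }

    public static void main(String[] args) {
        DataNode dn = new DataNode();
        dn.receive(new DeleteRequest(1, 100L)); // old NN asks to delete block 100
        dn.receive(new DeleteRequest(2, 200L)); // new NN takes over; queued request is dropped
        boolean accepted = dn.receive(new DeleteRequest(1, 300L)); // old NN still talking
        System.out.println("stale request accepted: " + accepted);
        System.out.println("deleted blocks: " + dn.executePending());
    }
}
```

Whether a DN should carry this logic at all is exactly the disagreement noted in
item 4. If the old NN is reliably shot first, it cannot send late requests, and
only the deletions already queued at the DN remain in question.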
Why, yes, you could do it differently, but a practical solution has:
* VIP
* LinuxHA
* No Zookeeper
* STONITH
* Only transfer one way without administrator intervention
The open argument in my mind is BN versus spool-to-disk. Oh, and if the LR
really means DNs need not know that there are multiple servers, life is
delightfully simpler.
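The STONITH and one-way-transfer items in the list above can be sketched as a
tiny controller. This is an illustrative Java sketch under stated assumptions,
not LinuxHA configuration or HDFS code: the Controller class, the Role enum, and
the stonith callback are all hypothetical names, and moving the VIP is
represented only by a comment.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch only: none of these names are LinuxHA or HDFS APIs.
final class OneWayFailoverSketch {

    enum Role { PRIMARY_ACTIVE, BACKUP_ACTIVE }

    static final class Controller {
        private Role role = Role.PRIMARY_ACTIVE;
        // Assumed callback: returns true only once the old NN is confirmed powered off.
        private final BooleanSupplier stonith;

        Controller(BooleanSupplier stonith) {
            this.stonith = stonith;
        }

        /** Automatic path: primary to backup only, and only after fencing succeeds. */
        boolean onPrimaryFailure() {
            if (role != Role.PRIMARY_ACTIVE) {
                return false; // already transferred once; never automatic again
            }
            if (!stonith.getAsBoolean()) {
                return false; // cannot prove the old NN is stopped, so do nothing
            }
            role = Role.BACKUP_ACTIVE; // the VIP would move to the backup here
            return true;
        }

        /** Manual path: only an administrator restores the original arrangement. */
        void administratorReset() {
            role = Role.PRIMARY_ACTIVE;
        }

        Role role() {
            return role;
        }
    }

    public static void main(String[] args) {
        Controller c = new Controller(() -> true); // pretend fencing always succeeds
        System.out.println("first transfer: " + c.onPrimaryFailure());
        System.out.println("second transfer: " + c.onPrimaryFailure());
        c.administratorReset();
        System.out.println("after admin reset: " + c.role());
    }
}
```

The asymmetry is deliberate: the process does not run backward on its own, which
is one way of answering the symmetry question in item 6.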
^1^ My memory is uncertain about the proper attribution here.
> Warm HA NameNode going Hot
> --------------------------
>
> Key: HDFS-2064
> URL: https://issues.apache.org/jira/browse/HDFS-2064
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: name-node
> Affects Versions: 0.22.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Attachments: WarmHA-GoingHot.pdf
>
>
> This is the design for automatic hot HA for the HDFS NameNode. It involves
> the use of HA software and LoadReplicator, components external to Hadoop,
> which substantially simplify the architecture by separating HA-specific from
> Hadoop-specific problems. Without the external components it provides warm
> standby with manual failover.