[ https://issues.apache.org/jira/browse/HDFS-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031686#comment-14031686 ]
Maysam Yabandeh commented on HDFS-6469:
---------------------------------------
Very interesting document indeed! I think as a community we should always keep
this option on the table and revisit it once in a while to reevaluate the pros
and cons.
Before giving detailed comments, I first want to make sure that I correctly
understand the big picture. Does the JIRA suggest the following path?
# let's make changes to get HDFS ready for pluggable consensus
# people start with a bad implementation of consensus with poor performance
# then probably a hero comes along who has a secret way of making consensus
efficient
If the above picture is correct, then the concern is whether everybody in the
community benefits enough to justify the cost of step 1 and the risk of being
stuck with step 2. I would be much more comfortable if I saw numbers for
latency, throughput, and, last but not least, code complexity. That should make
it easier to convince the community about the suggested path.
About the performance, here are some concerns:
# How much would the delay for write operations increase, in both average and
standard deviation?
# Does consensus negatively impact the *write* throughput of the NN? Paxos
requires *many* messages per proposal to be exchanged between participants,
which consumes CPU and network on the NN.
# Does the load on a DN scale with the number of CNs? If I understand
correctly, each DN has to send changes to all CNs. How much overhead would
there be on a DN when we have seven CNs?
# Due to performance issues, in practice we see Multi-Paxos implemented instead
of basic Paxos. In Multi-Paxos, a proposer assumes the role of the leader for a
time period specified by a lease. In this case, the failure of the leader still
makes the NN unavailable until a new leader is elected. I wonder whether this
would give any advantage over the current failover delay between primary and
standby. This concern would of course be invalid if you offer an efficient
solution for consensus that does not rely on leases.
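
To make the concerns above a bit more concrete, here is a rough
back-of-the-envelope sketch. It is not taken from the design document. It
assumes textbook single-decree Paxos with a proposer separate from the
acceptors and no message loss, a Multi-Paxos steady state where only the
accept round remains, a DN that sends each report to every CN, and a leader
that fails right after renewing its lease. All function names and parameters
are hypothetical and only serve the illustration.

```python
def basic_paxos_messages(acceptors: int) -> int:
    """Messages for one proposal in textbook single-decree Paxos.

    Assumption: prepare, promise, accept, and accepted are each exchanged
    once with every acceptor, so four messages per acceptor.
    """
    if acceptors < 1:
        raise ValueError("need at least one acceptor")
    return 4 * acceptors


def multi_paxos_messages(acceptors: int) -> int:
    """Steady-state messages per proposal under a stable leader.

    Assumption: the prepare/promise phase is amortized over the lease,
    leaving accept and accepted, so two messages per acceptor.
    """
    if acceptors < 1:
        raise ValueError("need at least one acceptor")
    return 2 * acceptors


def dn_messages_per_interval(reports_per_dn: int, consensus_nodes: int) -> int:
    """Messages one DN sends per interval if it reports to every CN."""
    if reports_per_dn < 0 or consensus_nodes < 1:
        raise ValueError("invalid report or CN count")
    return reports_per_dn * consensus_nodes


def worst_case_leader_outage(lease_seconds: float, election_seconds: float) -> float:
    """Rough upper bound on unavailability after a leader failure.

    Assumption: the leader dies right after renewing its lease, so the
    others wait out the full lease and then run an election.
    """
    return lease_seconds + election_seconds


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(
            f"CNs={n}: basic Paxos msgs/proposal={basic_paxos_messages(n)}, "
            f"Multi-Paxos msgs/proposal={multi_paxos_messages(n)}, "
            f"DN msgs/interval={dn_messages_per_interval(1, n)}"
        )
```

Under these assumptions the DN-side load grows linearly with the number of CNs,
and the lease length puts a floor under the failover time. Measured numbers
from a real Coordination Engine could of course look quite different.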
> Coordinated replication of the namespace using ConsensusNode
> ------------------------------------------------------------
>
> Key: HDFS-6469
> URL: https://issues.apache.org/jira/browse/HDFS-6469
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Attachments: CNodeDesign.pdf
>
>
> This is a proposal to introduce ConsensusNode - an evolution of the NameNode,
> which enables replication of the namespace on multiple nodes of an HDFS
> cluster by means of a Coordination Engine.