[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

Andrew Purtell (JIRA) Sun, 12 Jan 2014 11:13:23 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869117#comment-13869117
 ]


Andrew Purtell commented on HBASE-10296:
----------------------------------------

Also if this conversation morphs to a "let's P2P HBase", that would be out of 
scope for this JIRA which is "Replace ZK with a paxos running within master 
processes to provide better master failover performance and state consistency". 
Perhaps create or use another JIRA for that discussion? 

I will say this though: One crucial difference between a Dynamo or Cassandra 
install versus HBase is they do their own persistence. Maybe a HBase with a 
fully homogeneous set of behaviors would be useful, but unless running on an 
equivalently distributed persistence layer, we would still operationally be 
subject to HDFS's master-slave architecture. And a master-slave architecture 
has some distinct advantages over a P2P one without central control: Under 
failure conditions, it is easier to get things under control with one commander 
giving orders rather than relying on the convergence of emergent behaviors.

> Replace ZK with a paxos running within master processes to provide better 
> master failover performance and state consistency
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10296
>                 URL: https://issues.apache.org/jira/browse/HBASE-10296
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: master, Region Assignment, regionserver
>            Reporter: Feng Honghua
>
> Currently master relies on ZK to elect active master, monitor liveness and 
> store almost all of its states, such as region states, table info, 
> replication info and so on. And zk also plays as a channel for 
> master-regionserver communication(such as in region assigning) and 
> client-regionserver communication(such as replication state/behavior change). 
> But zk as a communication channel is fragile due to its one-time watch and 
> asynchronous notification mechanism which together can leads to missed 
> events(hence missed messages), for example the master must rely on the state 
> transition logic's idempotence to maintain the region assigning state 
> machine's correctness, actually almost all of the most tricky inconsistency 
> issues can trace back their root cause to the fragility of zk as a 
> communication channel.
> Replace zk with paxos running within master processes have following benefits:
> 1. better master failover performance: all master, either the active or the 
> standby ones, have the same latest states in memory(except lag ones but which 
> can eventually catch up later on). whenever the active master dies, the newly 
> elected active master can immediately play its role without such failover 
> work as building its in-memory states by consulting meta-table and zk.
> 2. better state consistency: master's in-memory states are the only truth 
> about the system,which can eliminate inconsistency from the very beginning. 
> and though the states are contained by all masters, paxos guarantees they are 
> identical at any time.
> 3. more direct and simple communication pattern: client changes state by 
> sending requests to master, master and regionserver talk directly to each 
> other by sending request and response...all don't bother to using a 
> third-party storage like zk which can introduce more uncertainty, worse 
> latency and more complexity.
> 4. zk can only be used as liveness monitoring for determining if a 
> regionserver is dead, and later on we can eliminate zk totally when we build 
> heartbeat between master and regionserver.
> I know this might looks like a very crazy re-architect, but it deserves deep 
> thinking and serious discussion for it, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

Reply via email to