[ 
https://issues.apache.org/jira/browse/HBASE-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-23206.
-----------------------------------------
    Resolution: Won't Fix

> ZK quorum redundancy with failover in RZK
> -----------------------------------------
>
>                 Key: HBASE-23206
>                 URL: https://issues.apache.org/jira/browse/HBASE-23206
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>
> We have faced a few production issues where the reliability of the ZooKeeper 
> quorum serving the cluster has not been as robust as expected. The most 
> recent one was essentially ZOOKEEPER-2164 (and related: ZOOKEEPER-900). These 
> can be mitigated by a ZK server configuration change but the incidents 
> suggest it may be worth thinking about how to be less reliant on the service 
> provided by a single ZK quorum instance. 
> A solution would be holistic with several parts:
> - HBASE-18095 to get ZK dependencies out of the client
> - Related HBase replication improvements to track peer and position state in 
> HBase tables instead of znodes
> - This brainstorming...
> For this issue, RecoverableZooKeeper (RZK) might be taught how to speak to 
> two separate ZK quorums redundantly, so ZK client operations via RZK succeed 
> even if one of them is temporarily unable to provide service. The loss of one 
> of a pair (or more) of redundant quorums would no longer impact availability 
> of the HBase service. 
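
The redundancy idea in the description can be illustrated with a minimal
sketch. It is not HBase or ZooKeeper code: QuorumClient, InMemoryQuorum and
RedundantClient are hypothetical stand-ins invented for illustration, and
unchecked exceptions are used in place of ZooKeeper's checked ones to keep
the example short. It only shows the failover shape under those assumptions:
reads go to the first quorum that answers, and writes go to every quorum and
succeed if at least one accepts them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RedundantQuorumSketch {

    /** Hypothetical stand-in for a client bound to a single ZK quorum. */
    interface QuorumClient {
        byte[] getData(String path);
        void setData(String path, byte[] data);
    }

    /** Hypothetical in-memory quorum whose availability can be toggled. */
    static final class InMemoryQuorum implements QuorumClient {
        private final Map<String, byte[]> store = new HashMap<>();
        volatile boolean available = true;

        @Override
        public byte[] getData(String path) {
            if (!available) throw new IllegalStateException("quorum unavailable");
            return store.get(path);
        }

        @Override
        public void setData(String path, byte[] data) {
            if (!available) throw new IllegalStateException("quorum unavailable");
            store.put(path, data.clone());
        }
    }

    /** Hypothetical wrapper that spreads operations over several quorums. */
    static final class RedundantClient {
        private final List<QuorumClient> quorums;

        RedundantClient(List<? extends QuorumClient> quorums) {
            if (quorums.isEmpty()) {
                throw new IllegalArgumentException("at least one quorum is required");
            }
            this.quorums = List.copyOf(quorums);
        }

        /** Reads from the first quorum that responds; rethrows the last failure if none do. */
        byte[] getData(String path) {
            RuntimeException last = null;
            for (QuorumClient q : quorums) {
                try {
                    return q.getData(path);
                } catch (RuntimeException e) {
                    last = e; // try the next quorum
                }
            }
            throw last;
        }

        /** Writes to every quorum; returns how many accepted the write, fails only if none did. */
        int setData(String path, byte[] data) {
            int succeeded = 0;
            RuntimeException last = null;
            for (QuorumClient q : quorums) {
                try {
                    q.setData(path, data);
                    succeeded++;
                } catch (RuntimeException e) {
                    last = e; // remember the failure, keep going
                }
            }
            if (succeeded == 0) throw last;
            return succeeded;
        }
    }
}
```

The sketch deliberately leaves out the hard part. A quorum that missed writes
while it was unavailable comes back holding stale data, so a real design would
also need reconciliation, plus a story for watches, ephemeral nodes and
sessions, none of which is modeled here.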



--
This message was sent by Atlassian Jira
(v8.20.7#820007)