ranpanfeng created HBASE-22918:
----------------------------------

             Summary: RegionServer violates failfast fault assumption
                 Key: HBASE-22918
                 URL: https://issues.apache.org/jira/browse/HBASE-22918
             Project: HBase
          Issue Type: Bug
            Reporter: ranpanfeng


HBase 2.1.5 was tested and verified thoroughly before being deployed in our 
production environment. We pay particular attention to NP (network partition) 
faults, so NP fault injection tests were conducted in our test environment, 
and some findings were exposed.

I use YCSB to write data into the table SYSTEM:test, which resides on 
regionserver0. During the write load, I use iptables to drop every packet from 
regionserver0 to the ZooKeeper quorum. After the default 
zookeeper.session.timeout (90s) elapses, regionserver0 throws 
YouAreDeadException once its retries to connect to ZooKeeper fail with 
TimeoutException, and then regionserver0 kills itself. But before 
regionserver0 invokes completeFile on its WAL, the active master has already 
considered regionserver0 dead prematurely, and invokes recoverLease to 
forcibly close the WAL of regionserver0.
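For reference, the fault injection described above can be reproduced with 
firewall rules like the following (a sketch; the quorum hostnames zk1..zk3 and 
the default client port 2181 are placeholders to adapt to your cluster):

```shell
# On regionserver0: drop all outbound packets to the ZooKeeper client port,
# simulating a network partition between the RS and the quorum.
for zk in zk1 zk2 zk3; do
  iptables -A OUTPUT -p tcp -d "$zk" --dport 2181 -j DROP
done

# Heal the partition after the test by deleting the same rules:
for zk in zk1 zk2 zk3; do
  iptables -D OUTPUT -p tcp -d "$zk" --dport 2181 -j DROP
done
```

Note that only the RS-to-ZooKeeper path is cut; RS-to-master and RS-to-client 
traffic keeps flowing, which is what exposes the race described here.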

In a trusted IDC, distributed storage assumes that errors are always 
failstop/failfast faults and that there are no Byzantine failures. So in the 
above scenario, the active master should take over the WAL of regionserver0 
only after regionserver0 has killed itself successfully. According to the 
lease protocol, the RS should suicide within a lease period, the active master 
should take over the WAL only after a grace period has elapsed, and the 
invariant "lease period < grace period" should always hold. In hbase-site.xml, 
only one config property, "zookeeper.session.timeout", is provided; I think we 
should provide two properties:

  1. regionserver.zookeeper.session.timeout

  2. master.zookeeper.session.timeout

HBase admins can then tune regionserver.zookeeper.session.timeout to be less 
than master.zookeeper.session.timeout. In this way, the failstop assumption is 
guaranteed.
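With the proposal, a deployment could look like the following hbase-site.xml 
fragment (a sketch only; the two property names are the ones proposed above 
and do not exist in current HBase, and the values are merely illustrative, 
chosen so the RS lease period stays below the master's grace period):

```xml
<!-- hbase-site.xml (proposed properties; not in current HBase) -->
<configuration>
  <!-- RS kills itself after losing its ZK session for 60s (lease period). -->
  <property>
    <name>regionserver.zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <!-- Master declares the RS dead only after 90s (grace period),
       so the invariant "lease period < grace period" always holds. -->
  <property>
    <name>master.zookeeper.session.timeout</name>
    <value>90000</value>
  </property>
</configuration>
```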



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
