ranpanfeng created HBASE-22918:
----------------------------------
Summary: RegionServer violates failfast fault assumption
Key: HBASE-22918
URL: https://issues.apache.org/jira/browse/HBASE-22918
Project: HBase
Issue Type: Bug
Reporter: ranpanfeng
HBase 2.1.5 was tested and verified thoroughly before being deployed in our
production environment. We pay particular attention to NP (network partition)
faults, so NP fault injection tests were conducted in our test environment.
Some findings are described below.
I used YCSB to write data into table SYSTEM:test, which resides on
regionserver0. During the write, I used iptables to drop every packet from
regionserver0 to the ZooKeeper quorum. After the default
zookeeper.session.timeout (90s) expired, regionserver0 threw YouAreDeadException
once its retries to connect to ZooKeeper failed with TimeoutException, and then
killed itself. However, before regionserver0 could invoke completeFile on its
WAL, the active master had already prematurely considered regionserver0 dead
and invoked recoverLease to forcibly close the WAL of regionserver0.
In a trusted IDC, distributed storage assumes that failures are always
fail-stop/fail-fast faults and that there are no Byzantine failures. So in the
above scenario, the active master should take over the WAL on regionserver0
only after regionserver0 has killed itself. According to the lease protocol,
the RS should kill itself within the lease period, the active master should
take over the WAL only after the grace period has elapsed, and the invariant
"lease period < grace period" should always hold. In hbase-site.xml, only one
config property, "zookeeper.session.timeout", is provided. I think we should
provide two properties:
1. regionserver.zookeeper.session.timeout
2. master.zookeeper.session.timeout
An HBase admin can then tune regionserver.zookeeper.session.timeout to be less
than master.zookeeper.session.timeout, for example as in the sketch below. In
this way, the fail-stop assumption is guaranteed.
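
A minimal hbase-site.xml sketch of the proposal (the two property names are
only the ones proposed above, they do not exist in current HBase, and the
values are purely illustrative):

{code:xml}
<!-- proposed: RS-side ZooKeeper session timeout, i.e. the lease period -->
<property>
  <name>regionserver.zookeeper.session.timeout</name>
  <value>90000</value>
</property>
<!-- proposed: master-side session timeout, i.e. the grace period;
     set larger than the RS-side value so "lease period < grace period" holds -->
<property>
  <name>master.zookeeper.session.timeout</name>
  <value>120000</value>
</property>
{code}

With such a split, the master would wait for the longer master-side timeout
before declaring regionserver0 dead and invoking recoverLease, so the RS would
already have killed itself by the time its WAL is forcibly closed.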
--
This message was sent by Atlassian Jira
(v8.3.2#803003)