stack created HBASE-9734:
----------------------------
Summary: Save 3-4 seconds by having master purge znode rather than
wait on RS exit
Key: HBASE-9734
URL: https://issues.apache.org/jira/browse/HBASE-9734
Project: HBase
Issue Type: Improvement
Components: MTTR
Reporter: stack
Priority: Critical
If an RS is aborting (in my current case because of SSR and running out of DM),
it tells the Master it's exiting by calling reportRSFatalError on the Master
Interface. The Master adds the RS to its list of fatal regionservers, but that
is about it.
The RS tries to clean up as best it can and exit quickly, but if it is carrying
regions it can be seconds before it gets to the purge of its ephemeral node.
The Master then needs to notice the purge, and only then can it start in on log
splitting.
The RS should purge its ephemeral node immediately on abort, or the Master
needs to do it and start log splitting as soon as the RS reports the fatal
error. In my case here that would save at least 4 seconds, and this is a small
cluster with only a few regions, so there is more to be had in a bigger setup.
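To make the Master-side option concrete, here is a rough sketch of the idea as a toy in-memory model. It is not HBase code: only the name reportRSFatalError comes from the description above, while MasterSketch, registerServer, znodeDeleted, expireServer and the accessors are hypothetical names invented for illustration, and a plain set stands in for the ephemeral znodes in ZooKeeper.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Toy model of the proposal. None of these types are real HBase classes. */
class MasterSketch {
    // Stand-in for the ephemeral znodes the region servers hold in ZooKeeper.
    private final Set<String> ephemeralZnodes = new HashSet<>();
    // Servers that have reported a fatal error.
    private final Set<String> fatalServers = new HashSet<>();
    // Servers whose logs are queued for splitting, in arrival order.
    private final Queue<String> logSplitQueue = new ArrayDeque<>();

    /** Hypothetical helper: a region server comes up and creates its znode. */
    void registerServer(String serverName) {
        ephemeralZnodes.add(serverName);
    }

    /** Modeled on reportRSFatalError: record the report, then act on it at once. */
    void reportRSFatalError(String serverName, String errorText) {
        fatalServers.add(serverName);
        // Proposed change: purge the znode on the server's behalf rather than
        // waiting for the server to finish closing regions and delete it itself.
        if (ephemeralZnodes.remove(serverName)) {
            expireServer(serverName);
        }
    }

    /** Hypothetical callback for the slow path: the server deleted its own znode. */
    void znodeDeleted(String serverName) {
        // A no-op if the Master already purged the znode, so the server's logs
        // are never queued for splitting twice.
        if (ephemeralZnodes.remove(serverName)) {
            expireServer(serverName);
        }
    }

    private void expireServer(String serverName) {
        logSplitQueue.add(serverName);
    }

    boolean hasZnode(String serverName) {
        return ephemeralZnodes.contains(serverName);
    }

    boolean isFatal(String serverName) {
        return fatalServers.contains(serverName);
    }

    List<String> pendingLogSplits() {
        return new ArrayList<>(logSplitQueue);
    }
}
```

In this model, calling reportRSFatalError for a registered server removes its znode and queues it for log splitting in the same call. A later znodeDeleted for the same server does nothing, which is the property a real change would need so that the late exit of the RS does not trigger a second round of splitting.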
--
This message was sent by Atlassian JIRA
(v6.1#6144)