apurtell commented on a change in pull request #3027:
URL: https://github.com/apache/hbase/pull/3027#discussion_r596277274
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java
##########
@@ -61,11 +64,23 @@ public void preClean() {
// but they won't be deleted because they're not in the checking set.
wals = queueStorage.getAllWALs();
} catch (ReplicationException e) {
- LOG.warn("Failed to read zookeeper, skipping checking deletable files");
+ LOG.warn("Failed to read zookeeper, skipping checking deletable files",
e);
+ handleSessionExpiredException(e);
wals = null;
}
}
+ /*
+ If preClean call encounters SessionExpiredException then we should abort
HMaster.
+ */
+ private void handleSessionExpiredException(ReplicationException e) {
+ if (e.getCause() != null && e.getCause() instanceof
KeeperException.SessionExpiredException) {
Review comment:
There are other KeeperException reasons that should probably be
considered fatal:
KeeperException.APIErrorException: Not expected, we've messed up in some
probably unrecoverable way
KeeperException.AuthFailedException: We don't have access to znodes we need
to have access to
KeeperException.InvalidACLException: Same, someone messed up the znode ACLs
on znodes we need
KeeperException.InvalidCallbackException: Not expected, we've messed up in
some probably unrecoverable way
KeeperException.NoAuthException: We don't have access to znodes we need to
have access to
KeeperException.RuntimeInconsistencyException: ZK data is corrupt
KeeperException.SessionMovedException: Maybe this is something that RZK
could handle by chasing the session via reconnects, but we don't do it today I
don't believe, so is as bad as SessionExpired?
KeeperException.SystemErrorException: ZK server failure
KeeperException.UnimplementedException: Not expected, we've messed up in
some probably unrecoverable way
and of course KeeperException.SessionExpiredException
Should these be handled too? Consider a ZKUtil static helper method ... call
it ZKUtil#isFatalKeeperException ... include these cases and call it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]