Ming Ma created HADOOP-11000:
--------------------------------
Summary: HAServiceProtocol's health state is incorrectly
transitioned to SERVICE_NOT_RESPONDING
Key: HADOOP-11000
URL: https://issues.apache.org/jira/browse/HADOOP-11000
Project: Hadoop Common
Issue Type: Bug
Reporter: Ming Ma
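When HAServiceProtocol.monitorHealth throws a HealthCheckFailedException, the
exception that actually reaches the caller over protocol buffer RPC is a
RemoteException that wraps the real exception. The catch clause for
HealthCheckFailedException in HealthMonitor.doHealthChecks therefore never
matches, the exception falls through to the generic Throwable handler, and the
state is incorrectly transitioned to SERVICE_NOT_RESPONDING instead of
SERVICE_UNHEALTHY.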
{noformat}
HealthMonitor.java
doHealthChecks
  try {
    status = proxy.getServiceStatus();
    proxy.monitorHealth();
    healthy = true;
  } catch (HealthCheckFailedException e) {
    .....
    enterState(State.SERVICE_UNHEALTHY);
  } catch (Throwable t) {
    .....
    enterState(State.SERVICE_NOT_RESPONDING);
    .....
  }
{noformat}
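
One possible way to handle this, sketched below under stated assumptions, is to inspect the wrapped exception's class name before deciding which state to enter. This is not necessarily the fix adopted for this issue. The sketch is self-contained: HealthCheckSketch, HealthProbe, isHealthCheckFailed and the nested exception classes are hypothetical stand-ins, not the real Hadoop types. In Hadoop itself, org.apache.hadoop.ipc.RemoteException exposes getClassName() and unwrapRemoteException(Class...), which could play the same role.

```java
import java.io.IOException;

public class HealthCheckSketch {
    enum State { SERVICE_HEALTHY, SERVICE_UNHEALTHY, SERVICE_NOT_RESPONDING }

    // Hypothetical stand-in for org.apache.hadoop.ha.HealthCheckFailedException.
    static class HealthCheckFailedException extends IOException {
        HealthCheckFailedException(String msg) { super(msg); }
    }

    // Simplified stand-in for the RPC layer's RemoteException: it carries the
    // class name of the exception that was thrown on the server side.
    static class RemoteException extends IOException {
        private final String className;
        RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }
        String getClassName() { return className; }
    }

    // Hypothetical stand-in for the monitorHealth() call on the RPC proxy.
    interface HealthProbe {
        void monitorHealth() throws IOException;
    }

    // True if t is a health-check failure, either thrown directly or
    // wrapped in a RemoteException by the RPC layer.
    static boolean isHealthCheckFailed(Throwable t) {
        if (t instanceof HealthCheckFailedException) {
            return true;
        }
        return t instanceof RemoteException
            && HealthCheckFailedException.class.getName()
                   .equals(((RemoteException) t).getClassName());
    }

    // Mirrors the shape of doHealthChecks, but classifies wrapped failures
    // as SERVICE_UNHEALTHY instead of SERVICE_NOT_RESPONDING.
    static State doHealthCheck(HealthProbe probe) {
        try {
            probe.monitorHealth();
            return State.SERVICE_HEALTHY;
        } catch (Throwable t) {
            return isHealthCheckFailed(t)
                ? State.SERVICE_UNHEALTHY
                : State.SERVICE_NOT_RESPONDING;
        }
    }

    public static void main(String[] args) {
        String wrappedName = HealthCheckFailedException.class.getName();
        // Failure wrapped by the RPC layer, as described in this issue.
        System.out.println(doHealthCheck(() -> {
            throw new RemoteException(wrappedName, "disk check failed");
        }));
        // Unrelated failure, e.g. the service cannot be reached at all.
        System.out.println(doHealthCheck(() -> {
            throw new IOException("connection refused");
        }));
    }
}
```

Checking the class name rather than relying on the catch clause keeps the local (non-RPC) case working while also covering the wrapped case. Any other exception still maps to SERVICE_NOT_RESPONDING.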
--
This message was sent by Atlassian JIRA
(v6.2#6252)