Bharat Viswanadham created HDDS-1476:
----------------------------------------
Summary: Fix logIfNeeded logic in EndPointStateMachine
Key: HDDS-1476
URL: https://issues.apache.org/jira/browse/HDDS-1476
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Bharat Viswanadham
{code:java}
public void logIfNeeded(Exception ex) {
  LOG.trace("Incrementing the Missed count. Ex : {}", ex);
  this.incMissed();
  if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
    LOG.error(
        "Unable to communicate to SCM server at {} for past {} seconds.",
        this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
        TimeUnit.MILLISECONDS.toSeconds(
            this.getMissedCount() * getScmHeartbeatInterval(this.conf)), ex);
  }
}
{code}
This method is called whenever an exception occurs in the state machine, in
order to log it. To avoid logging too aggressively, the
ozone.scm.heartbeat.log.warn.interval.count property controls how often the
error is logged.
There is a small issue here: the exception is not logged the first time it
occurs, because the missed count has already been incremented to 1 when the
modulo check runs. We need to log the first occurrence and only then increment
the missed count.
The fix is to move this.incMissed() to the end of the method, so that the first
exception is logged and, after that, every log.warn.interval.count-th
exception.
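The proposed ordering can be sketched in isolation as follows. This is a minimal, hypothetical stand-in and not the actual EndPointStateMachine code: the class name LogIfNeededSketch, the loggedAt list, and the constructor argument (standing in for ozone.scm.heartbeat.log.warn.interval.count) are assumptions made for illustration, and the LOG.error call is replaced by recording which exception would have been logged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the counting logic only; not the real
// EndPointStateMachine. It records which exceptions would be logged.
class LogIfNeededSketch {
  // Assumed to play the role of ozone.scm.heartbeat.log.warn.interval.count.
  private final int logWarnInterval;
  private long missedCount = 0;
  // 1-based sequence numbers of the exceptions that would be logged.
  private final List<Long> loggedAt = new ArrayList<>();

  LogIfNeededSketch(int logWarnInterval) {
    if (logWarnInterval <= 0) {
      throw new IllegalArgumentException("logWarnInterval must be positive");
    }
    this.logWarnInterval = logWarnInterval;
  }

  void logIfNeeded(Exception ex) {
    // Check first: missedCount is 0 on the first exception, so it is logged.
    if (missedCount % logWarnInterval == 0) {
      // The real method would call LOG.error(...) here.
      loggedAt.add(missedCount + 1);
    }
    // Increment last, as proposed in this issue.
    missedCount++;
  }

  List<Long> getLoggedAt() {
    return loggedAt;
  }
}
```

With an interval of 10 and 25 consecutive exceptions, this sketch records exceptions 1, 11 and 21. One side effect to keep in mind: if the real method keeps computing the elapsed time from getMissedCount(), the first message would report 0 seconds, since the count is still 0 at that point.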
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)