[
https://issues.apache.org/jira/browse/HADOOP-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236260#comment-16236260
]
Eric Yang commented on HADOOP-14989:
------------------------------------
Hi [~xkrogen], your observation is correct. However, {{MetricsSourceAdapter}}
cannot call {{updateJmxCache}} at the end of {{getMetrics}}. It would just
deadlock, because {{updateJmxCache}} itself calls {{getMetrics}}.
{{MetricsSystemImpl}} was not used by JMX in order to avoid a deadlock in the
timer thread, where a thread holding the {{MetricsSourceAdapter}} lock tries to
grab the {{MetricsSystemImpl}} lock. The locking order isn't consistent between
the "push" and "pull" parts of {{MetricsSourceAdapter}}, so it can deadlock.
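To make the lock-ordering concern concrete, here is a minimal sketch. It is not Hadoop code: {{LockOrderSketch}}, {{adapterLock}}, {{systemLock}} and both method names are hypothetical stand-ins for the two locks discussed above. It only shows how a timed {{tryLock}} lets the pull path give up instead of blocking forever while another thread holds the second lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: two plain locks standing in for the adapter and system locks. */
final class LockOrderSketch {
    static final ReentrantLock adapterLock = new ReentrantLock(); // hypothetical name
    static final ReentrantLock systemLock = new ReentrantLock();  // hypothetical name

    /**
     * Pull path: takes the adapter lock first, then needs the system lock.
     * A timed tryLock is used for the second lock so the caller backs off
     * instead of waiting forever. Returns true if both locks were obtained.
     */
    static boolean pullWithBackoff(long timeoutMillis) {
        adapterLock.lock();
        try {
            boolean gotSystem;
            try {
                gotSystem = systemLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (!gotSystem) {
                return false; // back off; a caller could serve a stale cached value
            }
            try {
                return true;  // both locks held here; real work would go in this block
            } finally {
                systemLock.unlock();
            }
        } finally {
            adapterLock.unlock();
        }
    }

    /** Runs the pull path while a second thread is holding the system lock. */
    static boolean pullWhileSystemLockHeld() {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread pusher = new Thread(() -> {
            systemLock.lock();
            try {
                held.countDown();
                release.await(); // keep the lock until the pull attempt is over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                systemLock.unlock();
            }
        });
        pusher.start();
        try {
            held.await();
            return pullWithBackoff(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
            try {
                pusher.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

In this sketch {{pullWhileSystemLockHeld()}} returns false after the timeout rather than hanging, and a later uncontended {{pullWithBackoff(50)}} succeeds. Whether backing off is acceptable for the real timer thread is a separate design question.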
Your second suggestion, storing the return value of {{getMetrics}} and using
it to populate the JMX cache, is the correct logic for a push vs. pull
system. We need to be careful when synchronizing the cached value to the
MBean, or the MBean can fail with a null value. HADOOP-11361 has some
background information on how the system arrived at its current state. The
{{ReentrantLock}} utility in {{java.util.concurrent.locks}} (available since
Java 5) might help to avoid the deadlock between publishing metrics and JMX
retrieving the cache. This might be one way to solve the race condition and
produce more accurate data for JMX. HADOOP-12594 was an attempt at removing
the deadlock, and it might be useful background information on how to solve
this the proper way.
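As a rough sketch of the "store the return value of {{getMetrics}}" idea, assuming a much simplified adapter: {{CachingAdapterSketch}}, its {{Supplier}}-based source and {{getAttribute}} are hypothetical and do not mirror the real {{MetricsSourceAdapter}} API. The point is only that the push path takes the one snapshot and publishes it, while the pull path reads the published copy and never snapshots.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical, simplified adapter; not the Hadoop MetricsSourceAdapter. */
final class CachingAdapterSketch {
    // Assumption: every call to source.get() is a destructive snapshot.
    private final Supplier<Map<String, Number>> source;

    // Starts as an empty map and is only ever swapped for a complete copy,
    // so a JMX reader never observes a null or half-built cache.
    private volatile Map<String, Number> jmxCache = Collections.emptyMap();

    CachingAdapterSketch(Supplier<Map<String, Number>> source) {
        this.source = source;
    }

    /** Push path: take one snapshot, publish it for JMX, and return it to the sink. */
    synchronized Map<String, Number> getMetrics() {
        Map<String, Number> copy = new HashMap<>(source.get());
        Map<String, Number> snapshot = Collections.unmodifiableMap(copy);
        jmxCache = snapshot; // single reference write publishes the whole map
        return snapshot;
    }

    /** Pull path: read-only lookup in the last published snapshot. */
    Number getAttribute(String name) {
        return jmxCache.get(name);
    }
}
```

With this shape, any number of {{getAttribute}} calls between two sink periods leaves the source untouched. The trade-off is that JMX values are only as fresh as the last push.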
> metrics2 JMX cache refresh result in inconsistent Mutable(Stat|Rate) values
> ---------------------------------------------------------------------------
>
> Key: HADOOP-14989
> URL: https://issues.apache.org/jira/browse/HADOOP-14989
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 2.6.5
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Critical
> Attachments: HADOOP-14989.test.patch
>
>
> While doing some digging in the metrics2 system recently, we noticed that the
> way {{MutableStat}} values are collected (and thus {{MutableRate}}, since it
> is based on {{MutableStat}}) means that every time the value is
> snapshotted, all previous information is lost. So every time a JMX cache
> refresh occurs, it resets the {{MutableStat}}, meaning that all configured
> metrics sinks do not consider the previous statistics in their emitted
> values. The same behavior occurs if you configure multiple sink periods.
> {{MutableStat}}, to compute its average value, maintains a total value since
> last snapshot, as well as operation count since last snapshot. Upon
> snapshotting, the average is calculated as (total / opCount) and placed into
> a gauge metric, and the total and operation count are cleared. So the average value
> represents the average since the last snapshot. If we have only a single sink
> period ever snapshotting, this would result in the expected behavior that the
> value is the average over the reporting period. However, if multiple sink
> periods are configured, or if the JMX cache is refreshed, this is another
> snapshot operation. So, for example, if you have a FileSink configured at a
> 60 second interval and your JMX cache refreshes itself 1 second before the
> FileSink period fires, the values emitted to your FileSink only represent
> averages _over the last one second_.
> A few ways to solve this issue:
> * Make {{MutableRate}} manage its own average refresh, similar to
> {{MutableQuantiles}}, which has a refresh thread and saves a snapshot of the
> last quantile values that it will serve up until the next refresh. Given how
> many {{MutableRate}} metrics there are, a thread per metric is not really
> feasible, but could be done on e.g. a per-source basis. This has some
> downsides: if multiple sinks are configured with different periods, what is
> the right refresh period for the {{MutableRate}}?
> * Make {{MutableRate}} emit two counters, one for total and one for operation
> count, rather than an average gauge and an operation count counter. The
> average could then be calculated downstream from this information. This is
> cumbersome for operators and not backwards compatible. To improve on both of
> those downsides, we could have it keep the current behavior but
> _additionally_ emit the total as a counter. The snapshotted average is
> probably sufficient in the common case (we've been using it for years), and
> when more guaranteed accuracy is required, the average could be derived from
> the total and operation count.
> The two above suggestions will fix this for both JMX and multiple sink
> periods, but may be overkill. Multiple sink periods are probably not
> necessary though we should at least document the behavior.
> Open to suggestions & input here.
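To illustrate the snapshot behavior in the quoted description, here is a small sketch. {{ResettingStatSketch}} is a simplified stand-in, not Hadoop's {{MutableStat}}. The never-reset counters model the "additionally emit the total as a counter" suggestion, and returning 0.0 for an empty interval is an assumption made only for this sketch.

```java
/** Simplified stand-in for snapshot-and-reset averaging; not Hadoop's MutableStat. */
final class ResettingStatSketch {
    private long total;     // sum of samples since the last snapshot
    private long opCount;   // number of samples since the last snapshot
    private long totalEver; // proposed extra counter: never reset
    private long opsEver;   // cumulative operation count: never reset

    synchronized void add(long sample) {
        total += sample;
        opCount++;
        totalEver += sample;
        opsEver++;
    }

    /** Returns the average since the previous snapshot and clears the interval state. */
    synchronized double snapshotAverage() {
        // Assumption for this sketch: an empty interval reports 0.0.
        double avg = opCount == 0 ? 0.0 : (double) total / opCount;
        total = 0;
        opCount = 0;
        return avg;
    }

    synchronized long totalCounter() { return totalEver; }

    synchronized long opCounter() { return opsEver; }

    /**
     * Replays the FileSink example: 59 samples of 10, a JMX refresh, one
     * sample of 100, then the sink snapshot.
     * Returns { jmxAverage, sinkAverage, averageDerivedFromCounters }.
     */
    static double[] fileSinkExample() {
        ResettingStatSketch stat = new ResettingStatSketch();
        for (int i = 0; i < 59; i++) {
            stat.add(10);
        }
        double jmxAverage = stat.snapshotAverage();  // JMX cache refresh resets the stat
        stat.add(100);
        double sinkAverage = stat.snapshotAverage(); // sink only sees the last sample
        double derived = (double) stat.totalCounter() / stat.opCounter();
        return new double[] { jmxAverage, sinkAverage, derived };
    }
}
```

In this sketch {{fileSinkExample()}} yields 10.0 for the JMX refresh, 100.0 for the sink snapshot, and 11.5 when derived from the two counters (690 / 60), which is the average over the whole period.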