[
https://issues.apache.org/jira/browse/HADOOP-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805355#action_12805355
]
Amareshwari Sriramadasu commented on HADOOP-6508:
-------------------------------------------------
Analyzing the following JobTrackerMetricsInst code with CompositeContext as
the MetricsContext:
{code}
MetricsContext context = MetricsUtil.getContext("mapred");
metricsRecord = MetricsUtil.createRecord(context, "jobtracker");
metricsRecord.setTag("sessionId", sessionId);
context.registerUpdater(this);
{code}
Details on each line of code:
{code}
MetricsContext context = MetricsUtil.getContext("mapred");
{code}
This call creates a CompositeContext (CC), which creates all of its
sub-contexts and calls startMonitoring() on each of them. There are therefore
as many monitoring threads as there are sub-contexts. Each thread calls
doUpdates() followed by emitRecords() once per configured period.
{code}
metricsRecord = MetricsUtil.createRecord(context, "jobtracker");
{code}
This call creates a MetricsRecord for the CompositeContext. The record is a
Proxy that delegates to all the sub-records: every method invoked on it is
invoked on each of its sub-records.
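To make the delegation concrete, here is a minimal sketch of such a proxy
built with java.lang.reflect.Proxy. The Record interface, LoggingRecord and
proxyFor() are hypothetical stand-ins written for this comment; they are not
the Hadoop classes, and the real CompositeContext code may differ in detail.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProxyRecordSketch {
    /** Simplified stand-in for MetricsRecord; not the real Hadoop interface. */
    public interface Record {
        void incrMetric(String name, int delta);
        void update();
    }

    /** Sub-record that remembers every call it receives. */
    public static class LoggingRecord implements Record {
        public final List<String> calls = new ArrayList<>();

        @Override
        public void incrMetric(String name, int delta) {
            calls.add("incrMetric(" + name + "," + delta + ")");
        }

        @Override
        public void update() {
            calls.add("update()");
        }
    }

    /** Returns a proxy that forwards each method call to every sub-record, in order. */
    public static Record proxyFor(List<? extends Record> subRecords) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object last = null;
            for (Record r : subRecords) {
                last = method.invoke(r, args);
            }
            return last;
        };
        return (Record) Proxy.newProxyInstance(
                Record.class.getClassLoader(), new Class<?>[] {Record.class}, handler);
    }

    public static void main(String[] args) {
        LoggingRecord r1 = new LoggingRecord();
        LoggingRecord r2 = new LoggingRecord();
        Record pr = proxyFor(Arrays.asList(r1, r2));
        pr.incrMetric("jobs_submitted", 1);
        pr.update();
        // Both sub-records have seen the same sequence of calls.
        System.out.println(r1.calls);
        System.out.println(r2.calls);
    }
}
```

Note that nothing in the handler takes a lock, so a proxy like this is only as
thread-safe as its callers make it.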
{code}
context.registerUpdater(this);
{code}
This code registers JobTracker as the updater for all the sub-contexts.
Putting the above relationships pictorially (see the attached png): JobTracker
(JT) has a CompositeContext (CC) and a ProxyMetricsRecord (PR). CC starts the
timer threads of Context1 (C1) and Context2 (C2). C1 has MetricsRecord (R1)
and C2 has MetricsRecord (R2). PR delegates all method calls on it to R1 and
R2. C1 and C2 register JobTracker as the updater.
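The registration fan-out described above can be sketched as follows. This is
an illustration only: Updater, SubContext and Composite are hypothetical,
simplified stand-ins (the callback here takes the caller's name as a String),
and tick() stands in for one period of a sub-context's timer thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompositeFanOutSketch {
    /** Simplified stand-in for the updater callback; receives the caller's name. */
    public interface Updater {
        void doUpdates(String callerName);
    }

    /** Stand-in for one sub-context; tick() plays the role of one timer period. */
    public static class SubContext {
        private final String name;
        private final List<Updater> updaters = new ArrayList<>();

        public SubContext(String name) {
            this.name = name;
        }

        public void registerUpdater(Updater u) {
            updaters.add(u);
        }

        public void tick() {
            for (Updater u : updaters) {
                u.doUpdates(name);
            }
        }
    }

    /** Stand-in for the composite: registration is forwarded to every sub-context. */
    public static class Composite {
        private final List<SubContext> subContexts;

        public Composite(List<SubContext> subContexts) {
            this.subContexts = subContexts;
        }

        public void registerUpdater(Updater u) {
            for (SubContext c : subContexts) {
                c.registerUpdater(u);
            }
        }
    }

    /** Registers one updater on the composite and fires one period on C1 and C2. */
    public static List<String> demo() {
        SubContext c1 = new SubContext("C1");
        SubContext c2 = new SubContext("C2");
        Composite cc = new Composite(Arrays.asList(c1, c2));
        List<String> callers = new ArrayList<>();
        cc.registerUpdater(callers::add); // the single updater plays the role of JT
        c1.tick();
        c2.tick();
        return callers;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With two sub-contexts, the one updater is called once per sub-context per
period. In the real setup those calls come from two different timer threads.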
Both C1 and C2 call JT.doUpdates at specified periods. The code flow for
doUpdates from C1 or C2:
1. Set/Incr methods on ProxyRecord are delegated to both R1 and R2. In the
JobTracker code these calls are synchronized.
2. MetricsRecord.update() boils down to R1.update() and R2.update(),
irrespective of whether the call comes from C1 or C2. This call is not
synchronized on JT.
Moreover, the MetricsRecord javadoc clearly says:
"Different threads should *not* use the same MetricsRecord instance at the
same time." So the problem here is that the ProxyRecord is shared between two
threads without synchronization. A race is possible if the above updates are
not synchronized. I think this is the most likely cause of the incorrect
values seen with CompositeContext.
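One possible way to close this race is to hold the same lock around both the
set/incr calls and update(). The sketch below illustrates that idea under
stated assumptions. It is not the JobTrackerMetricsInst code and not
necessarily the fix that should go into the patch. BufferedRecord is a
hypothetical, deliberately non-thread-safe stand-in for a metrics record, and
the two threads play the role of the C1 and C2 timers.

```java
import java.util.HashMap;
import java.util.Map;

public class SynchronizedUpdateSketch {
    /** Non-thread-safe stand-in for a metrics record (hypothetical). */
    static class BufferedRecord {
        private final Map<String, Integer> pending = new HashMap<>();
        private final Map<String, Integer> published = new HashMap<>();

        void incrMetric(String name, int delta) {
            pending.merge(name, delta, Integer::sum);
        }

        /** Moves the buffered increments into the published totals. */
        void update() {
            pending.forEach((k, v) -> published.merge(k, v, Integer::sum));
            pending.clear();
        }

        int published(String name) {
            return published.getOrDefault(name, 0);
        }
    }

    private final BufferedRecord record = new BufferedRecord();
    private int jobsSubmitted;

    public synchronized void submitJob() {
        jobsSubmitted++;
    }

    /** Called from every timer thread; the lock covers incrMetric AND update. */
    public synchronized void doUpdates() {
        record.incrMetric("jobs_submitted", jobsSubmitted);
        jobsSubmitted = 0;
        record.update();
    }

    /** Runs the given number of timer-like threads and returns the published total. */
    public static int run(int threads, int rounds) {
        SynchronizedUpdateSketch jt = new SynchronizedUpdateSketch();
        Thread[] timers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            timers[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    jt.submitJob();
                    jt.doUpdates();
                }
            });
            timers[i].start();
        }
        try {
            for (Thread t : timers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return jt.record.published("jobs_submitted");
    }

    public static void main(String[] args) {
        System.out.println(run(2, 10000));
    }
}
```

Because every access to the shared record happens under the same monitor, the
published total equals threads * rounds. If record.update() is moved outside
the synchronized method, the two threads can touch the record's maps
concurrently and the total is no longer guaranteed.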
> Incorrect values for metrics with CompositeContext
> --------------------------------------------------
>
> Key: HADOOP-6508
> URL: https://issues.apache.org/jira/browse/HADOOP-6508
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 0.20.0
> Reporter: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: CompositeContext.png
>
>
> In our clusters, when we use CompositeContext with two contexts, the second
> context gets wrong values.
> This problem occurs consistently on clusters of 500 or more nodes.