[ 
https://issues.apache.org/jira/browse/HADOOP-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805355#action_12805355
 ] 

Amareshwari Sriramadasu commented on HADOOP-6508:
-------------------------------------------------

Analyzing the following JobTrackerMetricsInst code, with CompositeContext as
the MetricsContext:
{code}
    MetricsContext context = MetricsUtil.getContext("mapred");
    metricsRecord = MetricsUtil.createRecord(context, "jobtracker");
    metricsRecord.setTag("sessionId", sessionId);
    context.registerUpdater(this);
{code}

Details on each line of code:
{code}
    MetricsContext context = MetricsUtil.getContext("mapred");
{code}
This line creates a CompositeContext (CC), which creates all of its
sub-contexts and calls startMonitoring() on each of them. There are therefore
as many monitoring threads as there are sub-contexts. Each thread calls
doUpdates() followed by emitRecords() at its configured period.
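
The fan-out described above can be sketched roughly as follows. This is not
the Hadoop implementation: CompositeSketch, SubContext, Composite, Updater and
tick() are hypothetical stand-ins, and java.util.Timer is only an assumption
about how a per-context monitoring thread might be driven.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CopyOnWriteArrayList;

public class CompositeSketch {
    /** Hypothetical stand-in for the updater callback (the JobTracker's role). */
    public interface Updater {
        void doUpdates(String contextName);
    }

    /** Hypothetical sub-context that owns exactly one timer thread. */
    public static class SubContext {
        public final String name;
        private final List<Updater> updaters = new CopyOnWriteArrayList<>();
        // Shared call log, for illustration only; pass a thread-safe list
        // if the timers are actually started.
        private final List<String> log;
        private Timer timer;

        public SubContext(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        public void registerUpdater(Updater u) {
            updaters.add(u);
        }

        /** One timer period: run every updater, then emit the records. */
        public void tick() {
            for (Updater u : updaters) {
                u.doUpdates(name);
            }
            log.add(name + ".emitRecords");
        }

        public void startMonitoring(long periodMillis) {
            timer = new Timer("timer-" + name, true); // daemon thread
            timer.scheduleAtFixedRate(new TimerTask() {
                @Override public void run() { tick(); }
            }, periodMillis, periodMillis);
        }

        public void stopMonitoring() {
            if (timer != null) timer.cancel();
        }
    }

    /** Hypothetical composite: fans each call out to all sub-contexts. */
    public static class Composite {
        public final List<SubContext> subs = new ArrayList<>();

        public void add(SubContext c) { subs.add(c); }

        public void startMonitoring(long periodMillis) {
            for (SubContext c : subs) c.startMonitoring(periodMillis);
        }

        public void registerUpdater(Updater u) {
            for (SubContext c : subs) c.registerUpdater(u);
        }
    }
}
```

With two sub-contexts, startMonitoring() on the composite starts two timer
threads, and each of them independently calls back into the same registered
updater.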

{code}
    metricsRecord = MetricsUtil.createRecord(context, "jobtracker");
{code}
This line creates a MetricsRecord for the CompositeContext. The record is a
Proxy whose delegator holds all the sub-records, so every method call made on
the proxy is invoked on each of its sub-records.
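
A minimal sketch of such a delegating proxy, built on java.lang.reflect.Proxy.
The Record interface, SimpleRecord and proxyFor() are hypothetical, cut-down
stand-ins for illustration and are not the real MetricsRecord API.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProxyRecordSketch {
    /** Hypothetical, cut-down record interface; not the real MetricsRecord. */
    public interface Record {
        void setMetric(String name, int value);
        void update();
    }

    /** Simple record that buffers values and publishes them on update(). */
    public static class SimpleRecord implements Record {
        public final Map<String, Integer> pending = new HashMap<>();
        public final Map<String, Integer> published = new HashMap<>();

        @Override public void setMetric(String name, int value) {
            pending.put(name, value);
        }

        @Override public void update() {
            published.putAll(pending);
            pending.clear();
        }
    }

    /** Builds a proxy that forwards every Record call to all sub-records. */
    public static Record proxyFor(List<? extends Record> subRecords) {
        // Note: java.lang.Object methods such as hashCode() are not handled
        // specially in this sketch.
        InvocationHandler handler = (proxy, method, args) -> {
            for (Record r : subRecords) {
                method.invoke(r, args);
            }
            return null; // every Record method is void
        };
        return (Record) Proxy.newProxyInstance(
            Record.class.getClassLoader(),
            new Class<?>[] {Record.class},
            handler);
    }
}
```

A single call on the proxy therefore touches every sub-record, whichever
thread makes the call.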

{code}
    context.registerUpdater(this);
{code}
This code registers JobTracker as the updater for all the sub-contexts.

Putting the above relationships pictorially (see the attached png): JobTracker
(JT) has a CompositeContext (CC) and a ProxyMetricsRecord (PR). CC starts the
timer threads of Context1 (C1) and Context2 (C2). C1 has MetricsRecord R1 and
C2 has MetricsRecord R2. PR delegates all method calls made on it to R1 and
R2. C1 and C2 both register JobTracker as their updater.

Both C1 and C2 call JT.doUpdates() at their specified periods. The code flow
for doUpdates() from either C1 or C2:
  1. Set/Incr methods on the ProxyRecord are delegated to both R1 and R2. In
the JobTracker code these calls are synchronized.
  2. MetricsRecord.update() boils down to R1.update() and R2.update(),
irrespective of whether the call comes from C1 or C2. This call is not
synchronized on JT.

Moreover, the MetricsRecord javadoc clearly says: "Different threads should
*not* use the same MetricsRecord instance at the same time." So the problem
here is that the ProxyRecord is shared between two threads without
synchronization. There is a possibility of a race if the above updates are not
synchronized. I think this is the most likely cause of the incorrect values
seen with CompositeContext.
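
One possible way to close this race, sketched under assumptions: hold the same
lock for the incr calls and for update(), so that two timer threads never use
the records at the same time. This is only an illustration and not necessarily
the fix adopted for this issue. Record, Updater, taskLaunched() and runDemo()
are hypothetical names, and the updater writes to R1 and R2 directly instead
of going through a proxy.

```java
import java.util.HashMap;
import java.util.Map;

public class SynchronizedUpdaterSketch {
    /** Hypothetical record, not thread-safe by itself (like MetricsRecord). */
    public static class Record {
        private final Map<String, Integer> pending = new HashMap<>();
        private final Map<String, Integer> total = new HashMap<>();

        public void incrMetric(String name, int delta) {
            pending.merge(name, delta, Integer::sum);
        }

        /** Publishes the buffered increments and clears the buffer. */
        public void update() {
            for (Map.Entry<String, Integer> e : pending.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
            pending.clear();
        }

        public int total(String name) {
            return total.getOrDefault(name, 0);
        }
    }

    /** Hypothetical updater playing the JobTracker's role. */
    public static class Updater {
        private final Record r1;
        private final Record r2;
        private int launched;

        public Updater(Record r1, Record r2) {
            this.r1 = r1;
            this.r2 = r2;
        }

        public synchronized void taskLaunched() {
            launched++;
        }

        /** Called by each sub-context's timer thread. */
        public void doUpdates() {
            synchronized (this) {
                r1.incrMetric("maps_launched", launched);
                r2.incrMetric("maps_launched", launched);
                launched = 0;
                // update() runs under the same lock as the incr calls, so the
                // records are never used by two threads concurrently.
                r1.update();
                r2.update();
            }
        }
    }

    /** Runs two "timer" threads against one updater; returns {R1, R2} totals. */
    public static int[] runDemo(int perThread) {
        Record r1 = new Record();
        Record r2 = new Record();
        Updater jt = new Updater(r1, r2);
        Runnable timerBody = () -> {
            for (int i = 0; i < perThread; i++) {
                jt.taskLaunched();
                jt.doUpdates();
            }
        };
        Thread c1 = new Thread(timerBody, "C1");
        Thread c2 = new Thread(timerBody, "C2");
        c1.start();
        c2.start();
        try {
            c1.join();
            c2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {r1.total("maps_launched"), r2.total("maps_launched")};
    }
}
```

If the two update() calls are moved outside the synchronized block, as in the
flow described above, the same demo can interleave incrMetric() on one thread
with update() on the other, and the totals are no longer guaranteed.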


> Incorrect values for metrics with CompositeContext
> --------------------------------------------------
>
>                 Key: HADOOP-6508
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6508
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.20.0
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 0.22.0
>
>         Attachments: CompositeContext.png
>
>
> In our clusters, when we use CompositeContext with two contexts, the second
> context gets wrong values.
> This problem is consistent on clusters of 500 nodes or more.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
