[
https://issues.apache.org/jira/browse/HADOOP-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626448#action_12626448
]
Alejandro Abdelnur commented on HADOOP-3748:
--------------------------------------------
For 200-node clusters, without counter traffic the peak for pkg_in on the JT
box was 900; with 200 counters the peak for pkg_in was 9000.
We did not see any significant CPU increase in the JT. But my thinking is that
this extra handling should create contention in some pieces of the JT.
For smaller clusters it was less. It seems to be non-linear: the bigger the
cluster, the more pkg_in on the JT box.
Our test jobs were doing nothing, just idling and incrementing counters. I'm
thinking of the overhead in the JT to handle those partial counters. Other
patches are trying to move work away from the JT to the TT or even the tasks
(i.e. committing a task output). In my eyes this patch is along the same lines:
remove as much load as possible from the JT.
> Flag to make tasks to send counter information only at the end of the task
> --------------------------------------------------------------------------
>
> Key: HADOOP-3748
> URL: https://issues.apache.org/jira/browse/HADOOP-3748
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Environment: all
> Reporter: Alejandro Abdelnur
>
> Currently counters are streamed from the task to the jobtracker as the task
> progresses. If the number of counters is large this has a significant impact
> on the network traffic as well as on the JobTracker load.
> There should be a flag, for example per counter-group, that indicates that the
> counters are to be reported only at the end of the task. By default this flag
> should be set to false for all counter-groups, maintaining the current
> behavior.
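
To make the proposed behavior concrete, here is a minimal Java sketch of a per-counter-group "report at end" flag. It is only an illustration under stated assumptions, not the Hadoop implementation and not the patch for this issue: the names CounterReporter, RecordingSink, setReportAtEnd, reportProgress and reportTaskEnd are all hypothetical, and the sink merely records what would have been sent to the jobtracker.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DeferredCounterSketch {

    /** Hypothetical stand-in for the task-to-jobtracker channel; it only records sends. */
    static class RecordingSink {
        final List<String> sent = new ArrayList<>();

        void send(String group, String counter, long value) {
            sent.add(group + "." + counter + "=" + value);
        }
    }

    /** Hypothetical task-side holder of counters, grouped by counter-group name. */
    static class CounterReporter {
        private final Map<String, Map<String, Long>> groups = new LinkedHashMap<>();
        private final Set<String> endOfTaskGroups = new HashSet<>();
        private final RecordingSink sink;

        CounterReporter(RecordingSink sink) {
            this.sink = sink;
        }

        /** The flag is false by default, so groups stream unless set here. */
        void setReportAtEnd(String group, boolean atEnd) {
            if (atEnd) {
                endOfTaskGroups.add(group);
            } else {
                endOfTaskGroups.remove(group);
            }
        }

        void increment(String group, String counter, long delta) {
            groups.computeIfAbsent(group, g -> new LinkedHashMap<>())
                  .merge(counter, delta, Long::sum);
        }

        /** Periodic update while the task runs: flagged groups are skipped. */
        void reportProgress() {
            report(false);
        }

        /** Final update: every group is sent, including the flagged ones. */
        void reportTaskEnd() {
            report(true);
        }

        private void report(boolean taskEnded) {
            for (Map.Entry<String, Map<String, Long>> g : groups.entrySet()) {
                if (!taskEnded && endOfTaskGroups.contains(g.getKey())) {
                    continue; // deferred until the end of the task
                }
                for (Map.Entry<String, Long> c : g.getValue().entrySet()) {
                    sink.send(g.getKey(), c.getKey(), c.getValue());
                }
            }
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        CounterReporter reporter = new CounterReporter(sink);
        reporter.setReportAtEnd("app", true);      // defer the "app" group

        reporter.increment("framework", "records", 10);
        reporter.increment("app", "widgets", 3);
        reporter.reportProgress();                 // sends only framework.records

        reporter.increment("app", "widgets", 4);
        reporter.reportTaskEnd();                  // sends both groups
        System.out.println(sink.sent);
    }
}
```

In this sketch the flagged group contributes nothing to the intermediate updates and a single final value at task end, which is the traffic reduction the issue asks for; unflagged groups keep the current streaming behavior.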