[
https://issues.apache.org/jira/browse/HADOOP-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626938#action_12626938
]
Alejandro Abdelnur commented on HADOOP-3748:
--------------------------------------------
Our test focused on how jobs with a large number of counters affect the
cluster.
Our observations indicate that incoming network traffic (pkg_in) increases
significantly, and faster than linearly (we are now testing on a 400-node
cluster to confirm this).
The test MR job itself was not doing anything useful, so we cannot claim a
performance improvement in job execution: our test map is coded to run for 30
mins (write counters, sleep, ...).
The reasoning is that by (significantly) lowering the network traffic to the JT
we make things easier for it, in addition to reducing network overhead.
BTW, the patch would be very simple: ~15 lines of code that just add an {{IF}}
condition in a couple of places.
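To illustrate the kind of {{IF}} condition meant here, a minimal standalone
sketch follows. It is not the actual patch and does not use Hadoop's classes:
the class DeferredCounterSketch, its methods, and the per-group flag map are
all hypothetical names chosen for illustration. It only shows the idea of
leaving flagged counter groups out of intermediate status updates and sending
them once the task is done.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names come from the Hadoop code base.
public class DeferredCounterSketch {

    // Hypothetical per-counter-group flag: true means "report only at task end".
    private final Map<String, Boolean> reportAtEndOnly = new HashMap<>();

    // Counter values accumulated locally: group -> (counter name -> value).
    private final Map<String, Map<String, Long>> counters = new HashMap<>();

    public void setReportAtEndOnly(String group, boolean atEndOnly) {
        reportAtEndOnly.put(group, atEndOnly);
    }

    public void increment(String group, String name, long delta) {
        counters.computeIfAbsent(group, g -> new HashMap<>())
                .merge(name, delta, Long::sum);
    }

    // Builds the counter payload for one status update to the tracker.
    public Map<String, Map<String, Long>> buildUpdate(boolean taskDone) {
        Map<String, Map<String, Long>> update = new HashMap<>();
        for (Map.Entry<String, Map<String, Long>> e : counters.entrySet()) {
            // Default is false, which keeps the current streaming behavior.
            boolean deferred = reportAtEndOnly.getOrDefault(e.getKey(), false);
            // The added IF: skip deferred groups until the task has finished.
            if (deferred && !taskDone) {
                continue;
            }
            update.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        return update;
    }
}
```

With the flag unset, every group is included in every update, as today. With
it set for a group, that group is sent only in the final update.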
> Flag to make tasks to send counter information only at the end of the task
> --------------------------------------------------------------------------
>
> Key: HADOOP-3748
> URL: https://issues.apache.org/jira/browse/HADOOP-3748
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Environment: all
> Reporter: Alejandro Abdelnur
>
> Currently counters are streamed from the task to the JobTracker as the task
> progresses. If the number of counters is large, this has a significant impact
> on network traffic as well as on the JobTracker load.
> There should be a flag, for example per counter group, indicating that the
> counters are to be reported only at the end of the task. By default this flag
> should be set to false for all counter groups, maintaining the current
> behavior.