[ 
https://issues.apache.org/jira/browse/HADOOP-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626938#action_12626938
 ] 

Alejandro Abdelnur commented on HADOOP-3748:
--------------------------------------------

Our test was focused on seeing how jobs with a large number of counters can 
affect the cluster. 

Our observations indicate that network pkg_in traffic increases 
significantly, and faster than linearly (we are now testing on a 400-node 
cluster to confirm this).

The test MR job itself was not doing anything useful, and we cannot claim a 
performance improvement in job execution, as our test map is coded to run for 
30 minutes (write counters, sleep, ...).

The reasoning is that by significantly lowering the network traffic to the JT 
we make things easier for it, in addition to reducing network overhead.

BTW, the patch would be very simple: ~15 lines of code that just add an {{IF}} 
condition in a couple of places.
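
A minimal sketch of what such a check could look like, for illustration only. 
This is not the actual patch: every class, method, and field name below is 
hypothetical and none of them is a Hadoop API. It assumes the flag is kept as 
a set of counter-group names and that the task builds one counter payload per 
status update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names are Hadoop classes or settings.
class CounterReportSketch {
    // Counter groups flagged to be reported only when the task finishes.
    private final Set<String> reportAtEndGroups;
    // Current values, keyed by group name, then by counter name.
    private final Map<String, Map<String, Long>> counters = new HashMap<>();

    CounterReportSketch(Set<String> reportAtEndGroups) {
        this.reportAtEndGroups = reportAtEndGroups;
    }

    // Adds the given amount to a counter, creating it on first use.
    void increment(String group, String counter, long amount) {
        counters.computeIfAbsent(group, g -> new HashMap<>())
                .merge(counter, amount, Long::sum);
    }

    // Builds the counter payload for one status update to the tracker.
    Map<String, Map<String, Long>> buildUpdate(boolean taskDone) {
        Map<String, Map<String, Long>> update = new HashMap<>();
        for (Map.Entry<String, Map<String, Long>> e : counters.entrySet()) {
            // The added IF: skip flagged groups until the final update.
            if (!taskDone && reportAtEndGroups.contains(e.getKey())) {
                continue;
            }
            update.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        return update;
    }
}
```

With an empty set of flagged groups, every update carries all counters, which 
matches the current behavior.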

> Flag to make tasks to send counter information only at the end of the task
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-3748
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3748
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>
> Currently counters are streamed from the task to the JobTracker as the task 
> progresses. If the number of counters is large, this has a significant impact 
> on network traffic as well as on the JobTracker load.
> There should be a flag, for example by counter-group, that indicates that the 
> counters are to be reported at the end of the task. By default this flag 
> should be set to false for all counter-groups, maintaining the current 
> behavior.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
