[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457093#comment-16457093
 ] 

Eric Wohlstadter commented on TEZ-3911:
---------------------------------------

I agree: the most important thing is removing any requirement for consumers to 
scan all task counters to compute aggregates. 
Simple convenience features (such as an average, which can be computed from two 
other values) are just nice-to-haves. 
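
For illustration only, here is a rough sketch of the kind of single-pass 
aggregate under discussion. It is not taken from the attached patches: the 
class CounterAggregate and its methods are hypothetical names, not Tez APIs. 
It tracks min, max, sum and count as each task value is folded in, so the 
average needs no extra state and no second scan.

```java
/** Hypothetical single-pass aggregate for one task counter; not a Tez class. */
final class CounterAggregate {
    private long min = Long.MAX_VALUE;   // stays at MAX_VALUE until a value is added
    private long max = Long.MIN_VALUE;   // stays at MIN_VALUE until a value is added
    private long sum = 0;
    private long count = 0;

    /** Fold in the counter value reported by one task (e.g. its best attempt). */
    void add(long value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;
        count++;
    }

    long min() { return min; }
    long max() { return max; }
    long sum() { return sum; }
    long count() { return count; }

    /** Derived from sum and count only; returns 0.0 when nothing was added. */
    double avg() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}
```

A caller would invoke add() once per task inside whatever loop already visits 
the task counters, then read min(), max() and avg() at the end. A consumer that 
already receives the sum and the task count could equally derive the average 
itself, which is why it is only a convenience.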

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3911
>                 URL: https://issues.apache.org/jira/browse/TEZ-3911
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Eric Wohlstadter
>            Assignee: Vineet Garg
>            Priority: Critical
>             Fix For: 0.9.next
>
>         Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch
>
>
> Consumers of counters reported to HistoryLoggingService currently have to 
> compute any task-level aggregation other than "sum" themselves. This is 
> inefficient because Tez is already scanning this data. Computing incremental 
> aggregates shouldn't require additional scans by ATS consumers. 
> Provide an option for Task counter aggregations other than "sum". Computation 
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for 
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. 
> Only incremental aggregations will be supported (min/max/avg). Aggregation 
> computation will be folded into the existing "aggregation loop" beginning at 
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be 
> provided. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
