[
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457082#comment-16457082
]
Vineet Garg commented on TEZ-3911:
----------------------------------
[~ashutoshc] I plan to add a config in {{VertexImpl::constructStatistics}}. This
config will control the {{aggregateAllCounters}} call. This patch doesn't yet
provide getMin/getMax APIs to retrieve min/max on TezCounter.
bq. Also, there is no 'avg' aggregation. I think sum(counter)/(number of tasks)
as avg would also be useful.
Isn't this trivial to compute for whoever is using the APIs? The reason we are
baking in min/max is so that consumers like the History Logging service wouldn't
have to loop over each task's counters to compute them. Let me know if you still
think avg would be useful. That API should probably be added separately at the
DAG level if we decide to implement it. cc [~ewohlstadter] [~gopalv]
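To illustrate the point about avg, here is a minimal sketch of the consumer-side computation, assuming the consumer already has the aggregated sum for a counter and the number of tasks in the vertex. The class and method names are hypothetical and are not part of the Tez API or of this patch.

```java
// Hypothetical consumer-side helper; not part of Tez or of the TEZ-3911 patch.
final class CounterAverages {
    private CounterAverages() {}

    /**
     * Derives the per-task average from an already aggregated sum.
     * Returns 0.0 when there are no tasks, to avoid dividing by zero.
     */
    static double average(long aggregatedSum, int numTasks) {
        if (numTasks <= 0) {
            return 0.0;
        }
        return (double) aggregatedSum / numTasks;
    }

    public static void main(String[] args) {
        // Example: 4 tasks whose counter values sum to 1000.
        System.out.println(average(1000L, 4));
    }
}
```

Min and max cannot be derived from the sum in this way; they need the per-task values, which is why they are computed during aggregation.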
> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at
> final counter aggr.
> -------------------------------------------------------------------------------------------------
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required
> to compute any task-level aggregations other than "sum". This is inefficient
> as Tez is already "scanning" over this data. Computing incremental aggregates
> shouldn't require additional scans by ATS consumers.
> Provide an option for Task counter aggregations other than "sum". Computation
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS.
> Only incremental aggregations will be supported (min/max/avg). Aggregation
> computation will be folded into the existing "aggregation loop" beginning at
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be
> provided.
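
The "incremental aggregation" idea in the description can be sketched as a single-pass accumulator: each task's counter value is folded into running min/max/sum/count, so no second scan is needed. This is a standalone illustration only. The class {{CounterStats}} and its methods are hypothetical, the input list is assumed to already hold one value per task taken from its best attempt, and nothing here reflects how {{VertexImpl}} actually implements the loop.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical accumulator; names are illustrative and not taken from the Tez code base.
final class CounterStats {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0L;
    private long count = 0L;

    /** Folds one task's counter value into the running aggregates. */
    void add(long value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;
        count++;
    }

    boolean isEmpty() { return count == 0; }

    // getMin/getMax are only meaningful when at least one value has been added.
    long getMin() { return min; }
    long getMax() { return max; }
    long getSum() { return sum; }

    /** Average over all added values, or 0.0 if nothing was added. */
    double getAvg() { return count == 0 ? 0.0 : (double) sum / count; }

    /** Aggregates all per-task values in one pass. */
    static CounterStats aggregate(List<Long> perTaskValues) {
        CounterStats stats = new CounterStats();
        for (long v : perTaskValues) {
            stats.add(v);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Example: one counter value per task (assumed best attempts).
        CounterStats s = aggregate(Arrays.asList(120L, 40L, 300L));
        System.out.println(s.getMin() + " " + s.getMax() + " " + s.getAvg());
    }
}
```

Because each update is constant time, folding it into an existing loop adds little work per task, though the cost while holding the aggregation lock would still need to be measured.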
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)