[
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457135#comment-16457135
]
Gopal V commented on TEZ-3911:
------------------------------
From an API perspective, it would be nice to deprecate the incr call and use a
single aggregate() call to compute min, max, count & sum in one function (which
is useful if it is synchronized for some reason: fewer locks in total).
This would mean incrAllCounters would call aggregateAllCounters and do the same
thing in both scenarios.
The lowest-level counter does not need min, max, etc., because a single task
makes many incr calls and the extra overhead buys nothing there.
The aggregated values are only useful for a higher-level counter, so a new
class like AbstractAggregatedCounter might be a good way to design that into
only the AM-generated counters (the task-side counters in the Hive operator
tree basically don't have range calculations).
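A rough sketch of the split described above, to make the idea concrete. The
class and method names (AggregatedCounter, TaskCounter, aggregate, getAvg) are
hypothetical and are not taken from the Tez code base or from the attached
patches; they only illustrate one synchronized call updating sum, count, min
and max together, while the task-side counter stays a plain increment.

```java
// Hypothetical sketch: names are illustrative, not Tez's actual API.

/** AM-side counter: one call folds a task's value into sum, count, min and max. */
final class AggregatedCounter {
    private long sum;
    private long count;
    private long min = Long.MAX_VALUE; // sentinel until the first value arrives
    private long max = Long.MIN_VALUE; // sentinel until the first value arrives

    /** Single lock acquisition covers all four aggregates. */
    synchronized void aggregate(long value) {
        sum += value;
        count++;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    synchronized long getSum() { return sum; }
    synchronized long getCount() { return count; }
    synchronized long getMin() { return min; }
    synchronized long getMax() { return max; }

    /** Average of the aggregated values, or 0.0 if nothing was aggregated. */
    synchronized double getAvg() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}

/** Task-side counter: many cheap increments, no range tracking. */
final class TaskCounter {
    private long value;

    void increment(long delta) { value += delta; }

    long getValue() { return value; }
}

final class AggregatedCounterDemo {
    public static void main(String[] args) {
        // Three tasks, each incrementing its own counter many times.
        long[][] perTaskIncrements = { {100, 20}, {45}, {150, 150} };
        AggregatedCounter vertexLevel = new AggregatedCounter();
        for (long[] increments : perTaskIncrements) {
            TaskCounter task = new TaskCounter();
            for (long delta : increments) {
                task.increment(delta);
            }
            // Only the final per-task value reaches the aggregated counter.
            vertexLevel.aggregate(task.getValue());
        }
        System.out.println("sum=" + vertexLevel.getSum()
            + " count=" + vertexLevel.getCount()
            + " min=" + vertexLevel.getMin()
            + " max=" + vertexLevel.getMax()
            + " avg=" + vertexLevel.getAvg());
    }
}
```

With this shape the per-task hot path pays nothing for the range tracking, and
the extra work happens once per task at aggregation time.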
> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at
> final counter aggr.
> -------------------------------------------------------------------------------------------------
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required
> to compute any task-level aggregations other than "sum". This is inefficient
> as Tez is already "scanning" over this data. Computing incremental aggregates
> shouldn't require additional scans by ATS consumers.
> Provide an option for Task counter aggregations other than "sum". Computation
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS.
> Only incremental aggregations will be supported (min/max/avg). Aggregation
> computation will be folded into the existing "aggregation loop" beginning at
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be
> provided.
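
For illustration only, the "synthetic counter" idea in the description could
look roughly like the sketch below. Everything here is an assumption made for
the example: the SyntheticCounters class, the finalAggregate method, the
boolean switch, the MIN_ and AVG_ prefixes (the description only names
MAX_GC_TIME_MILLIS), and the choice of a truncated integer average. It does not
reflect VertexImpl.incrTaskCounters or the attached patches. The input is
assumed to already contain only the best attempt of each task.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and prefixes are illustrative, not Tez's actual API.
final class SyntheticCounters {

    /** Running state for one counter name; every field updates incrementally. */
    private static final class Stats {
        long sum;
        long count;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;

        void add(long value) {
            sum += value;
            count++;
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
    }

    /**
     * One pass over the per-task counters (best attempt of each task only).
     * Always emits the sum under the original name; when extraAggregations is
     * true, also emits MIN_/MAX_/AVG_ synthetic counters.
     */
    static Map<String, Long> finalAggregate(
            List<Map<String, Long>> bestAttemptCounters, boolean extraAggregations) {
        Map<String, Stats> stats = new TreeMap<>();
        for (Map<String, Long> task : bestAttemptCounters) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                stats.computeIfAbsent(e.getKey(), k -> new Stats()).add(e.getValue());
            }
        }

        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, Stats> e : stats.entrySet()) {
            Stats s = e.getValue();
            out.put(e.getKey(), s.sum);
            if (extraAggregations) {
                out.put("MIN_" + e.getKey(), s.min);
                out.put("MAX_" + e.getKey(), s.max);
                // count >= 1 here because a Stats entry only exists after add().
                out.put("AVG_" + e.getKey(), s.sum / s.count); // truncated average
            }
        }
        return out;
    }
}
```

Because min, max and the sum/count pair behind avg can all be updated one value
at a time, the extra aggregates ride along in the same loop that already
computes the sum, with no second scan over the task data.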