[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468102#comment-16468102
 ] 

TezQA commented on TEZ-3911:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12922532/TEZ-3911.007.patch
  against master revision 081a64f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 27 javac 
compiler warnings (more than the master's current 24 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2795//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2795//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//console

This message is automatically generated.


> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> ------------------------------------------------------------------------
>
>                 Key: TEZ-3911
>                 URL: https://issues.apache.org/jira/browse/TEZ-3911
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Eric Wohlstadter
>            Assignee: Vineet Garg
>            Priority: Critical
>             Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, 
> TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, 
> TEZ-3911.006.patch, TEZ-3911.007.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required 
> to compute any task-level aggregations other than "sum". This is inefficient 
> as Tez is already "scanning" over this data. Computing incremental aggregates 
> shouldn't require additional scans by ATS consumers. 
> Provide an option for Task counter aggregations other than "sum". Computation 
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for 
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. 
> Only incremental aggregations will be supported (min/max/avg). Aggregation 
> computation will be folded into the existing "aggregation loop" beginning at 
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be 
> provided. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467994#comment-16467994
 ] 

Vineet Garg commented on TEZ-3911:
--

Latest patch (007) addresses review comment.



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467835#comment-16467835
 ] 

Gopal V commented on TEZ-3911:
--

The code LGTM. +1 pending a change to the test so that it generates different 
min/max values for vertex A and vertex B.

{code}
+Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMin());
+Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMax());
{code}

so that you would get 1,2 there instead of 1,1



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466894#comment-16466894
 ] 

Ashutosh Chauhan commented on TEZ-3911:
---

[~ewohlstadter] [~gopalv] Can you guys please review?



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466652#comment-16466652
 ] 

TezQA commented on TEZ-3911:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12922356/TEZ-3911.005.patch
  against master revision bb2c42b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 27 javac 
compiler warnings (more than the master's current 24 warnings).

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.
See 
https://builds.apache.org/job/PreCommit-TEZ-Build/2792//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 4 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestAMRecovery

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2792//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2792//artifact/patchprocess/patchReleaseAuditProblems.txt
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2792//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2792//console

This message is automatically generated.




[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-04 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464218#comment-16464218
 ] 

Vineet Garg commented on TEZ-3911:
--

[~gopalv] Did you get a chance to take a look at it? Could you check whether my 
approach is correct or whether it needs changes?



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461780#comment-16461780
 ] 

Gopal V commented on TEZ-3911:
--

Added to my review queue, thanks [~vgarg]



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-02 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461760#comment-16461760
 ] 

Vineet Garg commented on TEZ-3911:
--

[~ewohlstadter] Initializing it to 0 was my mistake. I have fixed that in 
the latest patch. Thanks for pointing it out.
The latest patch (003) adds a config flag as well as APIs to retrieve min/max. 
[~gopalv] suggested not using a config flag and enabling the new aggregations by 
default. 
[~gopalv] Can you expand on your second comment about adding an extra abstract 
class to handle this? It sounds like my approach might not be correct. 
Please take a look at the latest patch I uploaded.




[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-30 Thread Eric Wohlstadter (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459048#comment-16459048
 ] 

Eric Wohlstadter commented on TEZ-3911:
---

[~vgarg]

Can you briefly explain how bootstrapping the minimum aggregation works?

Since minValue is initialized to 0, I didn't see how any of the task counters 
are going to be less than the initial value. 
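
For illustration only (this is not code from the patch): a common way to bootstrap a running minimum is to seed it with Long.MAX_VALUE, and the maximum with Long.MIN_VALUE, so that the first aggregated value always replaces the seed. The class name RunningMin and its methods are hypothetical.

```java
// Hypothetical illustration; class and method names are not from the TEZ-3911 patch.
class RunningMin {
  // Seeding the minimum with 0 would pin it at 0 for non-negative counters,
  // so the seeds are the extreme values of long instead.
  private long minValue = Long.MAX_VALUE;
  private long maxValue = Long.MIN_VALUE;

  // Fold one task-level value into the running min and max.
  void aggregate(long value) {
    minValue = Math.min(minValue, value);
    maxValue = Math.max(maxValue, value);
  }

  long getMin() { return minValue; }
  long getMax() { return maxValue; }

  public static void main(String[] args) {
    RunningMin m = new RunningMin();
    for (long v : new long[] {42L, 7L, 19L}) {
      m.aggregate(v);
    }
    System.out.println(m.getMin() + " " + m.getMax());
  }
}
```

With a seed of 0, the same three values would report a minimum of 0, which is the behaviour questioned above.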



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457135#comment-16457135
 ] 

Gopal V commented on TEZ-3911:
--

From an API perspective, it would be nice to deprecate the incr call and use 1 
aggregate() call to do min, max, count & sum in a single function (which 
is useful if it is synchronized for some reason, fewer locks in total).

This would mean incrAllCounters would call aggregateAllCounters and do the same 
thing in both scenarios.

The lowest level counter does not need the min-max etc (because 1 task has many 
incr calls and the extra overhead is not useful).

The aggregated counters are only useful for a higher level counter - so a new 
class like AbstractAggregatedCounter might be a good way to design that into 
only the AM generated counters (& the task-side counters in the Hive operator 
tree basically don't have range calculations).
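
A minimal sketch of the single-call idea, assuming a standalone class rather than the real Tez counter hierarchy: one synchronized aggregate() updates sum, count, min and max under a single lock acquisition. AggregatedCounterSketch and all of its members are hypothetical names and do not come from the patch or from Tez.

```java
// Hypothetical sketch of the single-call idea; names are illustrative, not Tez APIs.
class AggregatedCounterSketch {
  private long sum = 0;
  private long count = 0;
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;

  // One lock acquisition updates every statistic for one task's final value.
  synchronized void aggregate(long taskValue) {
    sum += taskValue;
    count++;
    min = Math.min(min, taskValue);
    max = Math.max(max, taskValue);
  }

  synchronized long getSum() { return sum; }
  synchronized long getCount() { return count; }
  synchronized long getMin() { return min; }
  synchronized long getMax() { return max; }

  // Average is derived on demand from sum and count; 0.0 when nothing was aggregated.
  synchronized double getAvg() {
    return count == 0 ? 0.0 : (double) sum / count;
  }

  public static void main(String[] args) {
    AggregatedCounterSketch c = new AggregatedCounterSketch();
    for (long v : new long[] {10L, 30L, 20L}) {
      c.aggregate(v);
    }
    System.out.println(c.getSum() + " " + c.getCount() + " "
        + c.getMin() + " " + c.getMax() + " " + c.getAvg());
  }
}
```

Keeping the range fields in a separate aggregated class, as suggested above, would leave the per-task counters with a plain increment and no extra bookkeeping.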



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457126#comment-16457126
 ] 

Gopal V commented on TEZ-3911:
--

The implementation should have a default impl, instead of forcing API to do 
something strange.

The AbstractCounter can add a default implementation, to prevent API breakage 
for code which inherits the base impl from Tez.

Also sync nit on 

{code}
+  @Override
+  public void aggregate(long val) {
+    if (val < this.minVal.get()) {
+      this.minVal.set(val);
+    }
+    if (val > this.maxVal.get()) {
+      this.maxVal.set(val);
+    }
+  }
+
{code}

with atomic longs.
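
The nit presumably refers to the gap between get() and set(): with two separate calls, another thread can update the value in between, and a smaller minimum can be overwritten by a larger one. One possible way to close that gap, shown here as a sketch and not as the fix that went into the patch, is AtomicLong.accumulateAndGet, which retries until its update is applied atomically. The class AtomicMinMax is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the code from the TEZ-3911 patch.
class AtomicMinMax {
  private final AtomicLong minVal = new AtomicLong(Long.MAX_VALUE);
  private final AtomicLong maxVal = new AtomicLong(Long.MIN_VALUE);

  // accumulateAndGet applies the function with a compare-and-set retry loop,
  // so no other writer can slip in between the comparison and the store.
  void aggregate(long val) {
    minVal.accumulateAndGet(val, Math::min);
    maxVal.accumulateAndGet(val, Math::max);
  }

  long getMin() { return minVal.get(); }
  long getMax() { return maxVal.get(); }

  public static void main(String[] args) throws InterruptedException {
    AtomicMinMax c = new AtomicMinMax();
    Thread[] workers = new Thread[4];
    for (int t = 0; t < workers.length; t++) {
      final int base = t * 1000;
      // Each worker aggregates the values base+1 .. base+1000.
      workers[t] = new Thread(() -> {
        for (int i = 1; i <= 1000; i++) {
          c.aggregate(base + i);
        }
      });
      workers[t].start();
    }
    for (Thread w : workers) {
      w.join();
    }
    System.out.println(c.getMin() + " " + c.getMax());
  }
}
```

Note that min and max are still updated independently here; if both must be observed consistently with each other, a single lock around the whole update is the simpler option.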



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-27 Thread Eric Wohlstadter (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457093#comment-16457093
 ] 

Eric Wohlstadter commented on TEZ-3911:
---

I agree, the most important thing is removing any requirement for consumers to 
scan all Task Counters to compute aggregates. 
Simple convenience features (like an average that can be computed from two 
other values) are just a nice-to-have. 



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-27 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457082#comment-16457082
 ] 

Vineet Garg commented on TEZ-3911:
--

[~ashutoshc] I plan to add the config in {{VertexImpl::constructStatistics}}. This 
config will control the {{aggregateAllCounters}} call. This patch doesn't yet 
provide getMin/getMax APIs to retrieve min/max on TezCounter. 

bq. Also, there is no 'avg' aggregation. I think sum(counter)/(number of tasks) 
as avg would also be useful.
Isn't this trivial to compute for whoever is using the APIs? The reason we are 
baking in min/max is so that consumers like the History Logging service wouldn't 
have to loop over each task's counters to do so. Let me know if you still think 
avg would be useful. That API should probably be added separately at the DAG 
level if we decide to implement it. cc [~ewohlstadter] [~gopalv]



[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456878#comment-16456878
 ] 

Ashutosh Chauhan commented on TEZ-3911:
---

In general the patch seems to be heading in the right direction. I am not sure 
where the config will be, though. The patch seems to be adding new methods to do 
the aggregation and retrieve the results anyway. Maybe I am missing something. 
Can you upload the full patch?
Also, there is no 'avg' aggregation. I think sum(counter)/(number of tasks) as 
avg would also be useful.

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Eric Wohlstadter
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required 
> to compute any task-level aggregations other than "sum". This is inefficient 
> as Tez is already "scanning" over this data. Computing incremental aggregates 
> shouldn't require additional scans by ATS consumers. 
> Provide an option for Task counter aggregations other than "sum". Computation 
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for 
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. 
> Only incremental aggregations will be supported (min/max/avg). Aggregation 
> computation will be folded into the existing "aggregation loop" beginning at 
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be 
> provided. 
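
The description above can be sketched roughly as follows. This is an 
illustration under stated assumptions, not the design from the patch: 
SyntheticCounters, aggregate and the extraEnabled flag are hypothetical names, 
each input map stands for the counters of one task's best attempt, and the 
MIN_ and AVG_ prefixes are assumed by analogy with the MAX_GC_TIME_MILLIS 
example in the description.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not Tez code: sum plus optional synthetic counters. */
final class SyntheticCounters {

    /**
     * perTask holds one map per task (best attempt only), counter name to value.
     * The "sum" counters are always produced; MIN_/MAX_/AVG_ counters are
     * produced in the same loop only when extraEnabled is true.
     */
    static Map<String, Long> aggregate(List<Map<String, Long>> perTask,
                                       boolean extraEnabled) {
        Map<String, Long> out = new LinkedHashMap<>();
        // Number of tasks that reported each counter, needed for the average.
        Map<String, Long> seen = new LinkedHashMap<>();
        for (Map<String, Long> task : perTask) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                String name = e.getKey();
                long v = e.getValue();
                out.merge(name, v, Long::sum);
                if (extraEnabled) {
                    out.merge("MIN_" + name, v, Long::min);
                    out.merge("MAX_" + name, v, Long::max);
                    seen.merge(name, 1L, Long::sum);
                }
            }
        }
        // Derive the average at the end from sum and count (integer division).
        for (Map.Entry<String, Long> e : seen.entrySet()) {
            out.put("AVG_" + e.getKey(), out.get(e.getKey()) / e.getValue());
        }
        return out;
    }
}
```

With the flag off, the result contains only the existing sums, so consumers 
that do not ask for the extra counters see no change in what is reported.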





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-26 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455688#comment-16455688
 ] 

TezQA commented on TEZ-3911:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12920925/TEZ-3911.002.patch
  against master revision 2e66f3c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2776//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2776//console

This message is automatically generated.


> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-26 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455537#comment-16455537
 ] 

TezQA commented on TEZ-3911:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12920726/TEZ-3911.001.patch
  against master revision 2e66f3c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2775//console

This message is automatically generated.


> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-26 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455518#comment-16455518
 ] 

Vineet Garg commented on TEZ-3911:
--

Thanks [~jeagles]

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-26 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455516#comment-16455516
 ] 

Jonathan Eagles commented on TEZ-3911:
--

[~vgarg], you should have the permissions to assign jiras to yourself now.

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Eric Wohlstadter
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453475#comment-16453475
 ] 

Vineet Garg commented on TEZ-3911:
--

[~ewohlstadter] Can you assign this to me? I am unable to edit this jira. Looks 
like I don't have permissions.

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Eric Wohlstadter
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453474#comment-16453474
 ] 

Vineet Garg commented on TEZ-3911:
--

Attaching an initial patch to get a test run.

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Eric Wohlstadter
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-04-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453470#comment-16453470
 ] 

Vineet Garg commented on TEZ-3911:
--

[~ewohlstadter] [~t3rmin4t0r] Can you take a look at my first attempt at this 
at https://github.com/apache/tez/compare/master...vineetgarg02:TEZ-3911 and 
provide feedback? I would like to know whether I am on the right track. The 
implementation does not yet include the following:
* Config flag to control extra aggregation (min/max)
* Test coverage for the new aggregation
* Methods to retrieve min/max from counters

Looking forward to your feedback.

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Eric Wohlstadter
> Priority: Critical
> Fix For: 0.9.next


