[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468102#comment-16468102
 ] 

TezQA commented on TEZ-3911:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12922532/TEZ-3911.007.patch
  against master revision 081a64f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 27 javac 
compiler warnings (more than the master's current 24 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2795//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2795//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//console

This message is automatically generated.


> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Eric Wohlstadter
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, 
> TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, 
> TEZ-3911.006.patch, TEZ-3911.007.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required 
> to compute any task-level aggregations other than "sum". This is inefficient 
> as Tez is already "scanning" over this data. Computing incremental aggregates 
> shouldn't require additional scans by ATS consumers. 
> Provide an option for Task counter aggregations other than "sum". Computation 
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for 
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. 
> Only incremental aggregations will be supported (min/max/avg). Aggregation 
> computation will be folded into the existing "aggregation loop" beginning at 
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be 
> provided. 
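
The incremental min/max/avg idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the TEZ-3911 patch: the class name IncrementalCounterAggregate and its methods are hypothetical and do not exist in Tez. It shows why no second scan is needed: each task's value is folded into running aggregates inside the same loop that already computes the sum.

```java
import java.util.List;

/** Hypothetical helper, not a Tez class: min, max and average of one counter in a single pass. */
final class IncrementalCounterAggregate {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0;
    private long count = 0;

    /** Folds one task's counter value into the running aggregates. */
    void add(long value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;
        count++;
    }

    long min() { return min; }
    long max() { return max; }
    long sum() { return sum; }

    /** Average derived from the running sum and count; 0.0 when nothing was added. */
    double avg() { return count == 0 ? 0.0 : (double) sum / count; }

    /** Aggregates the per-task values of one counter in a single loop. */
    static IncrementalCounterAggregate of(List<Long> perTaskValues) {
        IncrementalCounterAggregate agg = new IncrementalCounterAggregate();
        for (long value : perTaskValues) {
            agg.add(value);
        }
        return agg;
    }
}
```

Under this sketch, a synthetic counter such as MAX_GC_TIME_MILLIS would simply be the max() of the per-task GC_TIME_MILLIS values, taken from the best attempt of each task.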



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Failed: TEZ-3911 PreCommit Build #2795

2018-05-08 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3911
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2795/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 353.48 KB...]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:01 h
[INFO] Finished at: 2018-05-08T23:23:09Z
[INFO] Final Memory: 83M/1360M
[INFO] 




==
==
Adding comment to Jira.
==
==




==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[Fast Archiver] Compressed 3.61 MB of artifacts by 26.8% relative to #2791
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
All tests passed

Failed: TEZ-3933 PreCommit Build #2796

2018-05-08 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3933
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2796/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 342.92 KB...]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tez-dag
[INFO] Build failures were ignored.




==
==
Adding comment to Jira.
==
==




==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[Fast Archiver] Compressed 3.59 MB of artifacts by 12.2% relative to #2791
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.tez.dag.app.TestSpeculation.testBasicSpeculationWithProgress

Error Message:
expected:<2> but was:<3>

Stack Trace:
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.tez.dag.app.TestSpeculation.testBasicSpeculation(TestSpeculation.java:172)
at org.apache.tez.dag.app.TestSpeculation.testBasicSpeculationWithProgress(TestSpeculation.java:193)

[jira] [Commented] (TEZ-3933) Remove sleep from test TestExceptionPropagation

2018-05-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468096#comment-16468096
 ] 

TezQA commented on TEZ-3933:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12922528/TEZ-3933.001.patch
  against master revision 081a64f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.TestSpeculation

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2796//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2796//console

This message is automatically generated.


> Remove sleep from test TestExceptionPropagation
> ---
>
> Key: TEZ-3933
> URL: https://issues.apache.org/jira/browse/TEZ-3933
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-3933.001.patch
>
>
> While investigating TEZ-3932, it was found that the test suite takes nearly 2 
> minutes to run. After removing the sleep, the test suite now takes 40 seconds.
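
The patch itself is not reproduced in this thread, so how the sleep was removed is not known here. One common way to avoid fixed sleeps in tests is to poll for a condition with a deadline. A minimal sketch, assuming a hypothetical helper WaitUtil.waitFor that is not part of Tez:

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper: waits for a condition instead of sleeping a fixed time. */
final class WaitUtil {
    private WaitUtil() {}

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was observed, false on timeout or interruption.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

With such a helper the test returns as soon as the condition holds, so the run time tracks the actual work rather than a worst-case sleep.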





[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race

2018-05-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468042#comment-16468042
 ] 

Jonathan Eagles commented on TEZ-3932:
--

[~jlowe], can you have a look at this NullPointerProtection patch? The test failure 
is due to an unrelated timeout (which probably should be bumped higher).

> TaskSchedulerManager can throw NullPointerException during DAGAppMaster 
> container cleanup race
> --
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
> Reporter: Valencia Edna Serrao
> Assignee: Jonathan Eagles
> Priority: Major
> Labels: ppc, x86
> Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, 
> org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Although 
> those issues are marked as resolved, the problem still occurs. Below 
> are the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation)  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227)
> {code}





[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race

2018-05-08 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468033#comment-16468033
 ] 

TezQA commented on TEZ-3932:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12922523/TEZ-3932.001.patch
  against master revision 081a64f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2794//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2794//console

This message is automatically generated.




Failed: TEZ-3932 PreCommit Build #2794

2018-05-08 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3932
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2794/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 342.77 KB...]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :tez-runtime-library
[INFO] Build failures were ignored.




==
==
Adding comment to Jira.
==
==




==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[Fast Archiver] Compressed 3.59 MB of artifacts by 10.4% relative to #2791
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig.testDefaultConfigsUsed

Error Message:
test timed out after 2000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 2000 milliseconds
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at sun.misc.Resource.getBytes(Resource.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:462)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig$Builder.(UnorderedPartitionedKVEdgeConfig.java:170)
at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig.newBuilder(UnorderedPartitionedKVEdgeConfig.java:79)
at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig.newBuilder(UnorderedPartitionedKVEdgeConfig.java:92)
at org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig.testDefaultConfigsUsed(TestUnorderedPartitionedKVEdgeConfig.java:69)

[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467994#comment-16467994
 ] 

Vineet Garg commented on TEZ-3911:
--

The latest patch (007) addresses the review comment.



[jira] [Updated] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated TEZ-3911:
-
Attachment: TEZ-3911.007.patch



[jira] [Updated] (TEZ-3933) Remove sleep from test TestExceptionPropagation

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3933:
-
Description: While investigating TEZ-3932, it was found that the test suite 
takes nearly 2 minutes to run. After removing the sleep, the test suite now takes 
40 seconds.



[jira] [Updated] (TEZ-3933) Remove sleep from test TestExceptionPropagation

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3933:
-
Attachment: TEZ-3933.001.patch



[jira] [Assigned] (TEZ-3933) Remove sleep from test TestExceptionPropagation

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned TEZ-3933:


Assignee: Jonathan Eagles



[jira] [Created] (TEZ-3933) Remove sleep from test TestExceptionPropagation

2018-05-08 Thread Jonathan Eagles (JIRA)
Jonathan Eagles created TEZ-3933:


 Summary: Remove sleep from test TestExceptionPropagation
 Key: TEZ-3933
 URL: https://issues.apache.org/jira/browse/TEZ-3933
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jonathan Eagles








[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race

2018-05-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467951#comment-16467951
 ] 

Jonathan Eagles commented on TEZ-3932:
--

[~vserrao], thank you for providing the test logs; with them I was able to create a 
reliable test case that reproduces this issue. I have created an initial 
patch that should remove the intermittent failure you have been facing, and I will 
work with the community to get it checked in. The logs show that this is not 
just a test issue but could happen in practice during shutdown scenarios. 
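
The patch is not quoted in this thread, so the following is not the TEZ-3932 fix. It is a generic sketch of how a NullPointerException in a shutdown race is often avoided: read the shared field once into a local variable and check it before use. The class CleanupSafeManager, the Scheduler interface and the method names are all hypothetical and are not Tez types.

```java
/** Hypothetical stand-in for a component whose delegate is cleared during shutdown. */
final class CleanupSafeManager {
    /** Hypothetical delegate interface; not a Tez type. */
    interface Scheduler {
        void deallocateContainer(String containerId);
    }

    // volatile so that a callback thread sees the write made by the shutdown thread
    private volatile Scheduler scheduler;

    CleanupSafeManager(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    /** Simulates shutdown clearing the delegate. */
    void shutdown() {
        scheduler = null;
    }

    /** Returns true if the request was forwarded, false if shutdown already happened. */
    boolean containerCompleted(String containerId) {
        Scheduler local = scheduler; // single read: a concurrent shutdown cannot null it afterwards
        if (local == null) {
            return false;            // late cleanup event after shutdown; nothing to do
        }
        local.deallocateContainer(containerId);
        return true;
    }
}
```

Reading the field twice (once for the check and once for the call) would leave a window in which shutdown could clear it between the two reads, which is the kind of race the issue title describes.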



[jira] [Updated] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3932:
-
Summary: TaskSchedulerManager can throw NullPointerException during 
DAGAppMaster container cleanup race  (was: 
TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc)



[jira] [Assigned] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned TEZ-3932:


Assignee: Jonathan Eagles

> TaskSchedulerManager can throw NullPointerException during DAGAppMaster 
> container cleanup race
> --
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Assignee: Jonathan Eagles
>Priority: Major
>  Labels: ppc, x86
> Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, 
> org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3932:
-
Attachment: TEZ-3932.001.patch

> TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> -
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
> Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, 
> org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.

2018-05-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467835#comment-16467835
 ] 

Gopal V commented on TEZ-3911:
--

The code LGTM. +1 pending a change to the test so that it produces different 
min and max values for vertex A and vertex B.

{code}
+Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMin());
+Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMax());
{code}

so that you would get 1 and 2 there instead of 1 and 1.
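
To illustrate why the test values matter, here is a minimal sketch of an incremental min/max/avg fold. The class MinMaxAvgSketch and its methods are hypothetical names made up for this example; this is not Tez's AggregateTezCounterDelegate, only an assumption about the general shape of such an aggregation.

```java
// Hypothetical illustration only; not the Tez implementation.
final class MinMaxAvgSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private long sum = 0;
    private long count = 0;

    /** Folds one task's counter value into the running aggregates. */
    void add(long value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
        sum += value;
        count++;
    }

    long getMin() { return min; }
    long getMax() { return max; }
    long getSum() { return sum; }

    /** Average over all values seen so far; 0.0 if nothing was added. */
    double getAvg() { return count == 0 ? 0.0 : (double) sum / count; }

    /** Convenience factory that folds all given values in order. */
    static MinMaxAvgSketch of(long... values) {
        MinMaxAvgSketch agg = new MinMaxAvgSketch();
        for (long v : values) {
            agg.add(v);
        }
        return agg;
    }
}
```

With MinMaxAvgSketch.of(1, 1) both getMin() and getMax() return 1, so a bug that swapped or conflated the two would go unnoticed. With MinMaxAvgSketch.of(1, 2) they return 1 and 2, which is the kind of distinction the requested test change is after.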

> Optional min/max/avg aggr. task counters reported to HistoryLoggingService at 
> final counter aggr.
> -
>
> Key: TEZ-3911
> URL: https://issues.apache.org/jira/browse/TEZ-3911
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Eric Wohlstadter
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 0.9.next
>
> Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, 
> TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, TEZ-3911.006.patch
>
>
> Consumers of HistoryLoggingService reported counters are currently required 
> to compute any task-level aggregations other than "sum". This is inefficient 
> as Tez is already "scanning" over this data. Computing incremental aggregates 
> shouldn't require additional scans by ATS consumers. 
> Provide an option for Task counter aggregations other than "sum". Computation 
> of these extra counters can be turned on/off.
> The option will generate "synthetic" counters at final aggregation time for 
> reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. 
> Only incremental aggregations will be supported (min/max/avg). Aggregation 
> computation will be folded into the existing "aggregation loop" beginning at 
> VertexImpl.incrTaskCounters.
> Extra aggregations will only be supported during final counter aggregation.
> Aggregations will only include the "bestAttempt" for each task.
> A design doc will be provided.
> Because final task aggregation holds a lock, a performance report will be 
> provided. 





[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3932:
-
Attachment: TEZ-3932.fail.patch

> TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> -
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
> Attachments: TEZ-3932.fail.patch, 
> org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3932:
-
Summary: TestExceptionPropagation#testExceptionPropagationSession fails on 
x86 and ppc  (was: estExceptionPropagation#testExceptionPropagationSession 
fails on x86 and ppc)

> TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> -
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
> Attachments: org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Commented] (TEZ-3931) TestExternalTezServices fails on Hadoop3

2018-05-08 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467585#comment-16467585
 ] 

Kuhu Shukla commented on TEZ-3931:
--

The patch looks good to me [~jeagles]! The -1 for no tests included is 
acceptable here. Committing this to master and branch-0.9 shortly.

> TestExternalTezServices fails on Hadoop3
> 
>
> Key: TEZ-3931
> URL: https://issues.apache.org/jira/browse/TEZ-3931
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Major
> Attachments: TEZ-3931.001.patch
>
>
> In addition to the netty upgrade that is needed (TEZ-3902), the dependency on 
> hadoop-mapreduce-client-shuffle needs to be added explicitly.
> {noformat}
> org.apache.tez.tests.TestExternalTezServices.org.apache.tez.tests.TestExternalTezServices
> Failing for the past 1 build (Since Failed#2782 )
> Took 5.4 sec.
> Error Message
> org/apache/hadoop/mapred/ShuffleHandler
> Stacktrace
> java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ShuffleHandler
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.tez.test.MiniTezCluster.serviceInit(MiniTezCluster.java:185)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.tez.tests.ExternalTezServiceTestHelper.<init>(ExternalTezServiceTestHelper.java:73)
>   at 
> org.apache.tez.tests.TestExternalTezServices.setup(TestExternalTezServices.java:76)
> {noformat}





[jira] [Comment Edited] (TEZ-3931) TestExternalTezServices fails on Hadoop3

2018-05-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467501#comment-16467501
 ] 

Jonathan Eagles edited comment on TEZ-3931 at 5/8/18 2:50 PM:
--

The following command shows the patch fixes the above startup dependency issue.
{noformat}
mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 
-P-hadoop27 -pl '!tez-ui'
{noformat}

This still needs an improved fix from TEZ-3902 but the JIRA needs to be put in 
first.
{noformat}
WARNING: An exception was thrown by a user handler while handling an exception 
event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] 
EXCEPTION: java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;)
java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
at 
org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at 
org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 
org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}


was (Author: jeagles):
The following command shows the patch fixes the above startup dependency issue.
{noformat}
mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 
-P-hadoop27 -pl '!tez-ui'
{noformat}

This still needs an improved fix from TEZ-3902
{noformat}
WARNING: An exception was thrown by a user handler while handling an exception 
event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] 
EXCEPTION: java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;)
java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
at 
org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at 
org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 

[jira] [Commented] (TEZ-3931) TestExternalTezServices fails on Hadoop3

2018-05-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467501#comment-16467501
 ] 

Jonathan Eagles commented on TEZ-3931:
--

The following command shows the patch fixes the above startup dependency issue.
{noformat}
mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 
-P-hadoop27 -pl '!tez-ui'
{noformat}

This still needs an improved fix from TEZ-3902
{noformat}
WARNING: An exception was thrown by a user handler while handling an exception 
event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] 
EXCEPTION: java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;)
java.lang.NoSuchMethodError: 
org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
at 
org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at 
org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
at 
org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
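
Both failures in this issue (the NoClassDefFoundError for ShuffleHandler and the NoSuchMethodError for headers()) are classpath problems, so a small reflection probe can help confirm which artifact versions a test JVM actually sees. The sketch below is only an illustration: ClasspathProbe and its method names are hypothetical, and it uses nothing beyond the standard java.lang reflection API.

```java
import java.lang.reflect.Method;

// Hypothetical helper for checking what is visible on the classpath.
final class ClasspathProbe {
    /** Returns true if the named class can be loaded (without initializing it). */
    static boolean hasClass(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if the class is loadable and exposes a public no-arg method with this name. */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            Method m = c.getMethod(methodName); // also finds inherited public methods
            return m != null;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }
}
```

For example, ClasspathProbe.hasClass("org.apache.hadoop.mapred.ShuffleHandler") would be expected to return false when hadoop-mapreduce-client-shuffle is missing, and ClasspathProbe.hasNoArgMethod("org.jboss.netty.handler.codec.http.HttpRequest", "headers") would be expected to return false when a netty version without that method is on the classpath.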

> TestExternalTezServices fails on Hadoop3
> 
>
> Key: TEZ-3931
> URL: https://issues.apache.org/jira/browse/TEZ-3931
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
>Priority: Major
> Attachments: TEZ-3931.001.patch
>
>
> In addition to the netty upgrade that is needed (TEZ-3902), the dependency on 
> hadoop-mapreduce-client-shuffle needs to be added explicitly.
> {noformat}
> org.apache.tez.tests.TestExternalTezServices.org.apache.tez.tests.TestExternalTezServices
> Failing for the past 1 build (Since Failed#2782 )
> Took 5.4 sec.
> Error Message
> org/apache/hadoop/mapred/ShuffleHandler
> Stacktrace
> java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ShuffleHandler
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.tez.test.MiniTezCluster.serviceInit(MiniTezCluster.java:185)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.tez.tests.ExternalTezServiceTestHelper.<init>(ExternalTezServiceTestHelper.java:73)
>   at 
> org.apache.tez.tests.TestExternalTezServices.setup(TestExternalTezServices.java:76)
> {noformat}





[jira] [Commented] (TEZ-3930) TestDagAwareYarnTaskScheduler fails on Hadoop 3

2018-05-08 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467459#comment-16467459
 ] 

Jonathan Eagles commented on TEZ-3930:
--

+1. The intermittent failure is unrelated (looking into it separately). Going 
to commit this to master and branch-0.9.

> TestDagAwareYarnTaskScheduler fails on Hadoop 3
> ---
>
> Key: TEZ-3930
> URL: https://issues.apache.org/jira/browse/TEZ-3930
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Jonathan Eagles
>Assignee: Jason Lowe
>Priority: Major
> Attachments: TEZ-3930.001.patch
>
>
> When scheduler shutdown is called, AMRMClientAsyncImpl's serviceStop is 
> invoked, which interrupts the heartbeat thread and then proceeds to join 
> on it. The heartbeat thread continues to run and continues 
> to throw NullPointerExceptions. The interrupt no longer seems to actually 
> interrupt the thread in Hadoop 3 (is YARN-5999 to blame, or Tez?).
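
As background for the interrupt-then-join pattern described above, the following sketch shows one general way an interrupt can fail to stop a loop: Thread.sleep clears the interrupt status when it throws, so a loop that swallows the exception keeps running and a join on it never returns on its own. This is not a claim about what AMRMClientAsyncImpl or Tez actually does (the issue leaves the cause open); HeartbeatInterruptSketch and its methods are hypothetical names for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of interrupt handling in a polling loop.
final class HeartbeatInterruptSketch {
    /** Loop that swallows the interrupt; only the external stop flag ends it. */
    static Thread swallowingLoop(AtomicBoolean stop) {
        return new Thread(() -> {
            while (!stop.get()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    // Swallowed: sleep() cleared the interrupt status, so the loop continues.
                }
            }
        });
    }

    /** Loop that restores the interrupt status and exits once it is set. */
    static Thread cooperativeLoop() {
        return new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore so the loop condition sees it
                }
            }
        });
    }

    /** Starts the thread, interrupts it, and reports whether it died within the timeout. */
    static boolean stopsAfterInterrupt(Thread t, long joinMillis) {
        t.start();
        t.interrupt();
        try {
            t.join(joinMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

Calling stopsAfterInterrupt on the swallowing loop is expected to report that the thread is still alive after the timeout, while the cooperative loop exits promptly. A join without a timeout on the first kind of thread would block until something else stops it.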





[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Valencia Edna Serrao (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valencia Edna Serrao updated TEZ-3932:
--
Attachment: org.apache.tez.test.TestExceptionPropagation-output.txt

> estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> 
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
> Attachments: org.apache.tez.test.TestExceptionPropagation-output.txt
>
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Valencia Edna Serrao (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valencia Edna Serrao updated TEZ-3932:
--
Labels: ppc x86  (was: )

> estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> 
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Valencia Edna Serrao (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valencia Edna Serrao updated TEZ-3932:
--
Affects Version/s: 0.10.0

> estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> 
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.10.0
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>  Labels: ppc, x86
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. Though 
> the issue is marked as resolved in those JIRAs, it still occurs. Below are 
> the error details:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec 
> <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) 
>  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at 
> org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}





[jira] [Commented] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc

2018-05-08 Thread Valencia Edna Serrao (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467149#comment-16467149
 ] 

Valencia Edna Serrao commented on TEZ-3932:
---

Thanks [~jeagles] for taking a look at this issue. Here are the details you 
requested:
 # Tez version: 0.10.0-SNAPSHOT
 # Frequency: Quite consistent on ppc for the past few weeks, but it didn't 
occur when I tried commit 2e66f3cb2ef082889551f6a0830c7014317d9680. This week, 
however, I see it on both architectures.
 # Test failure logs: [^org.apache.tez.test.TestExceptionPropagation-output.txt]

Please let me know if you have any questions.

> TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
> 
>
> Key: TEZ-3932
> URL: https://issues.apache.org/jira/browse/TEZ-3932
> Project: Apache Tez
>  Issue Type: Bug
> Environment: arch: x86 and ppc
> java: openjdk version "1.8.0_161"
>  OpenJDK Runtime Environment (build 1.8.0_161-b14)
>  OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode)
>Reporter: Valencia Edna Serrao
>Priority: Major
>
> Test 
> org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession 
> fails on x86 and ppc. I found the related JIRAs TEZ-3746 and TEZ-3748. 
> Although both are marked as resolved, the issue still occurs. The error 
> details are below:
> {code:java}
> ---
> Test set: org.apache.tez.test.TestExceptionPropagation
> ---
> Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec <<< FAILURE!
> testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation)  Time elapsed: 52.7 sec  <<< ERROR!
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, 
> finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed 
> with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, 
> successfulDAGs=0, failedDAGs=12, killedDAGs=0]
>     at org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024)
>     at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034)
>     at org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652)
>     at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588)
>     at org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227
> {code}


