[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.
[ https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468102#comment-16468102 ] TezQA commented on TEZ-3911: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922532/TEZ-3911.007.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 27 javac compiler warnings (more than the master's current 24 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//console This message is automatically generated. > Optional min/max/avg aggr. task counters reported to HistoryLoggingService at > final counter aggr. > - > > Key: TEZ-3911 > URL: https://issues.apache.org/jira/browse/TEZ-3911 > Project: Apache Tez > Issue Type: New Feature >Reporter: Eric Wohlstadter >Assignee: Vineet Garg >Priority: Critical > Fix For: 0.9.next > > Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, > TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, > TEZ-3911.006.patch, TEZ-3911.007.patch > > > Consumers of HistoryLoggingService reported counters are currently required > to compute any task-level aggregations other than "sum". 
This is inefficient > as Tez is already "scanning" over this data. Computing incremental aggregates > shouldn't require additional scans by ATS consumers. > Provide an option for Task counter aggregations other than "sum". Computation > of these extra counters can be turned on/off. > The option will generate "synthetic" counters at final aggregation time for > reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. > Only incremental aggregations will be supported (min/max/avg). Aggregation > computation will be folded into the existing "aggregation loop" beginning at > VertexImpl.incrTaskCounters. > Extra aggregations will only be supported during final counter aggregation. > Aggregations will only include the "bestAttempt" for each task. > A design doc will be provided. > Because final task aggregation holds a lock, a performance report will be > provided. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
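The "incremental aggregation" idea in the description above can be illustrated with a minimal Java sketch. This is not the TEZ-3911 patch: the class `IncrementalAggregate` and its methods are hypothetical, the sample values are made up, and the `MIN_`/`AVG_` prefixes are an assumption made by analogy with the `MAX_GC_TIME_MILLIS` example in the description. It only shows how min, max, and avg can be maintained in the same pass that already computes the sum, so consumers need no second scan over per-task values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Tez): keeps a running sum, min, max and
 * count for one counter so every aggregate comes out of a single pass.
 */
final class IncrementalAggregate {
    private long count;
    private long sum;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Folds one task's counter value (e.g. from its best attempt) into the aggregate. */
    void add(long value) {
        count++;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    long getSum() { return sum; }

    long getMin() { requireNonEmpty(); return min; }

    long getMax() { requireNonEmpty(); return max; }

    /** Integer average (truncating division), derived from the running sum and count. */
    long getAvg() { requireNonEmpty(); return sum / count; }

    private void requireNonEmpty() {
        if (count == 0) {
            throw new IllegalStateException("no values aggregated");
        }
    }

    /** Builds synthetic counters; the MIN_/MAX_/AVG_ prefixes are an assumed naming scheme. */
    Map<String, Long> toSyntheticCounters(String counterName) {
        Map<String, Long> out = new LinkedHashMap<>();
        out.put("MIN_" + counterName, getMin());
        out.put("MAX_" + counterName, getMax());
        out.put("AVG_" + counterName, getAvg());
        return out;
    }

    public static void main(String[] args) {
        long[] gcMillisPerTask = {120L, 80L, 400L}; // made-up sample values
        IncrementalAggregate agg = new IncrementalAggregate();
        for (long v : gcMillisPerTask) {
            agg.add(v); // one update per task, no second scan
        }
        System.out.println(agg.toSyntheticCounters("GC_TIME_MILLIS"));
    }
}
```

Each `add` call does constant work, which matters here because, as the description notes, final task aggregation holds a lock.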
Failed: TEZ-3911 PreCommit Build #2795
Jira: https://issues.apache.org/jira/browse/TEZ-3911 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2795/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 353.48 KB...] [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 01:01 h [INFO] Finished at: 2018-05-08T23:23:09Z [INFO] Final Memory: 83M/1360M [INFO] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922532/TEZ-3911.007.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 27 javac compiler warnings (more than the master's current 24 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2795//console This message is automatically generated. == == Adding comment to Jira. == == == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [Fast Archiver] Compressed 3.61 MB of artifacts by 26.8% relative to #2791 [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## All tests passed
Failed: TEZ-3933 PreCommit Build #2796
Jira: https://issues.apache.org/jira/browse/TEZ-3933 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2796/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 342.92 KB...] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-dag [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922528/TEZ-3933.001.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.app.TestSpeculation Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2796//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2796//console This message is automatically generated. == == Adding comment to Jira. == == == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [Fast Archiver] Compressed 3.59 MB of artifacts by 12.2% relative to #2791 [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. FAILED: org.apache.tez.dag.app.TestSpeculation.testBasicSpeculationWithProgress Error Message: expected:<2> but was:<3> Stack Trace: java.lang.AssertionError: expected:<2> but was:<3> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.tez.dag.app.TestSpeculation.testBasicSpeculation(TestSpeculation.java:172) at org.apache.tez.dag.app.TestSpeculation.testBasicSpeculationWithProgress(TestSpeculation.java:193)
[jira] [Commented] (TEZ-3933) Remove sleep from test TestExceptionPropagation
[ https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468096#comment-16468096 ] TezQA commented on TEZ-3933: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922528/TEZ-3933.001.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.app.TestSpeculation Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2796//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2796//console This message is automatically generated. > Remove sleep from test TestExceptionPropagation > --- > > Key: TEZ-3933 > URL: https://issues.apache.org/jira/browse/TEZ-3933 > Project: Apache Tez > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Major > Attachments: TEZ-3933.001.patch > > > While investigating TEZ-3932, it was found that the test suite takes nearly 2 > minutes to run. After removing the sleep, test suite now takes 40 seconds. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468042#comment-16468042 ] Jonathan Eagles commented on TEZ-3932: -- [~jlowe], can you have a look at this NullPointerProtection patch? Test failure is due to an unrelated timeout (probably should be bumped higher) > TaskSchedulerManager can throw NullPointerException during DAGAppMaster > container cleanup race > -- > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Assignee: Jonathan Eagles >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468033#comment-16468033 ] TezQA commented on TEZ-3932: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922523/TEZ-3932.001.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2794//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2794//console This message is automatically generated. 
> TaskSchedulerManager can throw NullPointerException during DAGAppMaster > container cleanup race > -- > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Assignee: Jonathan Eagles >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Failed: TEZ-3932 PreCommit Build #2794
Jira: https://issues.apache.org/jira/browse/TEZ-3932 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/2794/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 342.77 KB...] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :tez-runtime-library [INFO] Build failures were ignored. {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12922523/TEZ-3932.001.patch against master revision 081a64f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2794//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2794//console This message is automatically generated. == == Adding comment to Jira. == == == == Finished build. 
== == Build step 'Execute shell' marked build as failure Archiving artifacts [Fast Archiver] Compressed 3.59 MB of artifacts by 10.4% relative to #2791 [description-setter] Could not determine description. Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any ### ## FAILED TESTS (if any) ## 1 tests failed. FAILED: org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig.testDefaultConfigsUsed Error Message: test timed out after 2000 milliseconds Stack Trace: java.lang.Exception: test timed out after 2000 milliseconds at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:255) at sun.misc.Resource.getBytes(Resource.java:124) at java.net.URLClassLoader.defineClass(URLClassLoader.java:462) at java.net.URLClassLoader.access$100(URLClassLoader.java:73) at java.net.URLClassLoader$1.run(URLClassLoader.java:368) at java.net.URLClassLoader$1.run(URLClassLoader.java:362) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:361) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig$Builder.<init>(UnorderedPartitionedKVEdgeConfig.java:170) at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig.newBuilder(UnorderedPartitionedKVEdgeConfig.java:79) at org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig.newBuilder(UnorderedPartitionedKVEdgeConfig.java:92) at org.apache.tez.runtime.library.conf.TestUnorderedPartitionedKVEdgeConfig.testDefaultConfigsUsed(TestUnorderedPartitionedKVEdgeConfig.java:69)
[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.
[ https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467994#comment-16467994 ] Vineet Garg commented on TEZ-3911: -- Latest patch (007) addresses review comment. > Optional min/max/avg aggr. task counters reported to HistoryLoggingService at > final counter aggr. > - > > Key: TEZ-3911 > URL: https://issues.apache.org/jira/browse/TEZ-3911 > Project: Apache Tez > Issue Type: New Feature >Reporter: Eric Wohlstadter >Assignee: Vineet Garg >Priority: Critical > Fix For: 0.9.next > > Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, > TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, > TEZ-3911.006.patch, TEZ-3911.007.patch > > > Consumers of HistoryLoggingService reported counters are currently required > to compute any task-level aggregations other than "sum". This is inefficient > as Tez is already "scanning" over this data. Computing incremental aggregates > shouldn't require additional scans by ATS consumers. > Provide an option for Task counter aggregations other than "sum". Computation > of these extra counters can be turned on/off. > The option will generate "synthetic" counters at final aggregation time for > reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. > Only incremental aggregations will be supported (min/max/avg). Aggregation > computation will be folded into the existing "aggregation loop" beginning at > VertexImpl.incrTaskCounters. > Extra aggregations will only be supported during final counter aggregation. > Aggregations will only include the "bestAttempt" for each task. > A design doc will be provided. > Because final task aggregation holds a lock, a performance report will be > provided. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.
[ https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated TEZ-3911: - Attachment: TEZ-3911.007.patch > Optional min/max/avg aggr. task counters reported to HistoryLoggingService at > final counter aggr. > - > > Key: TEZ-3911 > URL: https://issues.apache.org/jira/browse/TEZ-3911 > Project: Apache Tez > Issue Type: New Feature >Reporter: Eric Wohlstadter >Assignee: Vineet Garg >Priority: Critical > Fix For: 0.9.next > > Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, > TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, > TEZ-3911.006.patch, TEZ-3911.007.patch > > > Consumers of HistoryLoggingService reported counters are currently required > to compute any task-level aggregations other than "sum". This is inefficient > as Tez is already "scanning" over this data. Computing incremental aggregates > shouldn't require additional scans by ATS consumers. > Provide an option for Task counter aggregations other than "sum". Computation > of these extra counters can be turned on/off. > The option will generate "synthetic" counters at final aggregation time for > reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. > Only incremental aggregations will be supported (min/max/avg). Aggregation > computation will be folded into the existing "aggregation loop" beginning at > VertexImpl.incrTaskCounters. > Extra aggregations will only be supported during final counter aggregation. > Aggregations will only include the "bestAttempt" for each task. > A design doc will be provided. > Because final task aggregation holds a lock, a performance report will be > provided. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3933) Remove sleep from test TestExceptionPropagation
[ https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3933:

Description: While investigating TEZ-3932, it was found that the test suite takes nearly 2 minutes to run. After removing the sleep, test suite now takes 40 seconds.

> Remove sleep from test TestExceptionPropagation
> -----------------------------------------------
>
> Key: TEZ-3933
> URL: https://issues.apache.org/jira/browse/TEZ-3933
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-3933.001.patch
>
>
> While investigating TEZ-3932, it was found that the test suite takes nearly 2
> minutes to run. After removing the sleep, test suite now takes 40 seconds.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3933) Remove sleep from test TestExceptionPropagation
[ https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3933:

Attachment: TEZ-3933.001.patch

> Remove sleep from test TestExceptionPropagation
> -----------------------------------------------
>
> Key: TEZ-3933
> URL: https://issues.apache.org/jira/browse/TEZ-3933
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-3933.001.patch
>

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3933) Remove sleep from test TestExceptionPropagation
[ https://issues.apache.org/jira/browse/TEZ-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles reassigned TEZ-3933:

Assignee: Jonathan Eagles

> Remove sleep from test TestExceptionPropagation
> -----------------------------------------------
>
> Key: TEZ-3933
> URL: https://issues.apache.org/jira/browse/TEZ-3933
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-3933.001.patch
>

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (TEZ-3933) Remove sleep from test TestExceptionPropagation
Jonathan Eagles created TEZ-3933:

Summary: Remove sleep from test TestExceptionPropagation
Key: TEZ-3933
URL: https://issues.apache.org/jira/browse/TEZ-3933
Project: Apache Tez
Issue Type: Bug
Reporter: Jonathan Eagles

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467951#comment-16467951 ] Jonathan Eagles commented on TEZ-3932: -- [~vserrao], thank you for providing the test logs; with them I was able to create a reliable test case that reproduces this issue. I was also able to create an initial patch that will remove the intermittent issue you have been facing, and I will work with the community to get it checked in. These logs show that this is not just a test issue but could happen in practice during shutdown scenarios. > TaskSchedulerManager can throw NullPointerException during DAGAppMaster > container cleanup race > -- > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Assignee: Jonathan Eagles >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! 
> org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3932: - Summary: TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race (was: TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc) > TaskSchedulerManager can throw NullPointerException during DAGAppMaster > container cleanup race > -- > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (TEZ-3932) TaskSchedulerManager can throw NullPointerException during DAGAppMaster container cleanup race
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles reassigned TEZ-3932: Assignee: Jonathan Eagles > TaskSchedulerManager can throw NullPointerException during DAGAppMaster > container cleanup race > -- > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Assignee: Jonathan Eagles >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3932: - Attachment: TEZ-3932.001.patch > TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > - > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.001.patch, TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3911) Optional min/max/avg aggr. task counters reported to HistoryLoggingService at final counter aggr.
[ https://issues.apache.org/jira/browse/TEZ-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467835#comment-16467835 ] Gopal V commented on TEZ-3911: -- The code LGTM - +1 pending change to test to generate a diff min/max for vertex A and vertex B. {code} +Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMin()); +Assert.assertEquals(1, ((AggregateTezCounterDelegate)vBCounters.findCounter(globalCounterName, globalCounterName)).getMax()); {code} so that you would get 1,2 there instead of 1,1 > Optional min/max/avg aggr. task counters reported to HistoryLoggingService at > final counter aggr. > - > > Key: TEZ-3911 > URL: https://issues.apache.org/jira/browse/TEZ-3911 > Project: Apache Tez > Issue Type: New Feature >Reporter: Eric Wohlstadter >Assignee: Vineet Garg >Priority: Critical > Fix For: 0.9.next > > Attachments: TEZ-3911.001.patch, TEZ-3911.002.patch, > TEZ-3911.003.patch, TEZ-3911.004.patch, TEZ-3911.005.patch, TEZ-3911.006.patch > > > Consumers of HistoryLoggingService reported counters are currently required > to compute any task-level aggregations other than "sum". This is inefficient > as Tez is already "scanning" over this data. Computing incremental aggregates > shouldn't require additional scans by ATS consumers. > Provide an option for Task counter aggregations other than "sum". Computation > of these extra counters can be turned on/off. > The option will generate "synthetic" counters at final aggregation time for > reporting to HistoryLoggingService, e.g. MAX_GC_TIME_MILLIS. > Only incremental aggregations will be supported (min/max/avg). Aggregation > computation will be folded into the existing "aggregation loop" beginning at > VertexImpl.incrTaskCounters. > Extra aggregations will only be supported during final counter aggregation. > Aggregations will only include the "bestAttempt" for each task. > A design doc will be provided. 
> Because final task aggregation holds a lock, a performance report will be > provided. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
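The "incremental aggregations" idea in the TEZ-3911 description above can be sketched as follows. This is only an illustration, not the patch's `AggregateTezCounterDelegate`: the class `RunningAggregate` and its methods are hypothetical names. It shows how min, max and avg can be folded into the same single pass that already computes the sum, so no second scan over the task values is needed.

```java
/**
 * Hypothetical helper (not part of Tez): folds per-task counter values
 * into sum/min/max/avg in a single pass, one value at a time.
 */
final class RunningAggregate {
    private long count = 0;
    private long sum = 0;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Fold one task's counter value into the running aggregates. */
    void add(long value) {
        count++;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    long getSum() {
        return sum;
    }

    long getMin() {
        requireNonEmpty();
        return min;
    }

    long getMax() {
        requireNonEmpty();
        return max;
    }

    /** The average is derived from the running sum and count. */
    double getAvg() {
        requireNonEmpty();
        return (double) sum / count;
    }

    private void requireNonEmpty() {
        if (count == 0) {
            throw new IllegalStateException("no values aggregated yet");
        }
    }
}
```

Feeding two different task values (1 and 2) into such an aggregate yields a min of 1 and a max of 2, which is the kind of distinguishing check the review comment asks the test to make.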
[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3932: - Attachment: TEZ-3932.fail.patch > TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > - > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > Attachments: TEZ-3932.fail.patch, > org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3932) TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-3932: - Summary: TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc (was: estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc) > TestExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > - > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > Attachments: org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3931) TestExternalTezServices fails on Hadoop3
[ https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467585#comment-16467585 ] Kuhu Shukla commented on TEZ-3931: -- The patch looks good to me [~jeagles]! The -1 for no tests included is acceptable here. Committing this to master and branch-0.9 shortly. > TestExternalTezServices fails on Hadoop3 > > > Key: TEZ-3931 > URL: https://issues.apache.org/jira/browse/TEZ-3931 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Major > Attachments: TEZ-3931.001.patch > > > In addition, to a netty upgrade needed (TEZ-3902), the dependency for > hadoop-mapreduce-client-shuffle needs to be added explicitly. > {noformat} > org.apache.tez.tests.TestExternalTezServices.org.apache.tez.tests.TestExternalTezServices > Failing for the past 1 build (Since Failed#2782 ) > Took 5.4 sec. > Error Message > org/apache/hadoop/mapred/ShuffleHandler > Stacktrace > java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ShuffleHandler > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at > org.apache.tez.test.MiniTezCluster.serviceInit(MiniTezCluster.java:185) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.tez.tests.ExternalTezServiceTestHelper.<init>(ExternalTezServiceTestHelper.java:73) > at > org.apache.tez.tests.TestExternalTezServices.setup(TestExternalTezServices.java:76) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (TEZ-3931) TestExternalTezServices fails on Hadoop3
[ https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467501#comment-16467501 ] Jonathan Eagles edited comment on TEZ-3931 at 5/8/18 2:50 PM: -- The following command shows the patch fixes the above startup dependency issue. {noformat} mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 -P-hadoop27 -pl '!tez-ui' {noformat} This still needs an improved fix from TEZ-3902 but the JIRA needs to be put in first. {noformat} WARNING: An exception was thrown by a user handler while handling an exception event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] EXCEPTION: java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;) java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders; at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377) at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525) at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) at 
org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88) at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} was (Author: jeagles): The following command shows the patch fixes the above startup dependency issue. 
{noformat} mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 -P-hadoop27 -pl '!tez-ui' {noformat} This still needs an improved fix from TEZ-3902 {noformat} WARNING: An exception was thrown by a user handler while handling an exception event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] EXCEPTION: java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;) java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders; at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377) at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525) at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at
[jira] [Commented] (TEZ-3931) TestExternalTezServices fails on Hadoop3
[ https://issues.apache.org/jira/browse/TEZ-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467501#comment-16467501 ] Jonathan Eagles commented on TEZ-3931: -- The following command shows the patch fixes the above startup dependency issue. {noformat} mvn clean test -Dtest=TestExternalTezServices -Dhadoop.version=3.0.2 -Phadoop28 -P-hadoop27 -pl '!tez-ui' {noformat} This still needs an improved fix from TEZ-3902 {noformat} WARNING: An exception was thrown by a user handler while handling an exception event ([id: 0x922dfc31, /172.130.98.95:60354 => /172.130.98.95:60248] EXCEPTION: java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpRequest.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders;) java.lang.NoSuchMethodError: org.jboss.netty.handler.codec.http.HttpResponse.headers()Lorg/jboss/netty/handler/codec/http/HttpHeaders; at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1327) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1321) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendError(ShuffleHandler.java:1316) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.exceptionCaught(ShuffleHandler.java:1366) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377) at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525) at org.jboss.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:48) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459) at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88) at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} > TestExternalTezServices fails on Hadoop3 > > > Key: TEZ-3931 > URL: https://issues.apache.org/jira/browse/TEZ-3931 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles >Priority: Major > Attachments: TEZ-3931.001.patch > > > In addition, to a netty upgrade needed (TEZ-3902), the dependency for > hadoop-mapreduce-client-shuffle needs to be added explicitly. > {noformat} > org.apache.tez.tests.TestExternalTezServices.org.apache.tez.tests.TestExternalTezServices > Failing for the past 1 build (Since Failed#2782 ) > Took 5.4 sec. 
> Error Message > org/apache/hadoop/mapred/ShuffleHandler > Stacktrace > java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ShuffleHandler > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at > org.apache.tez.test.MiniTezCluster.serviceInit(MiniTezCluster.java:185) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.tez.tests.ExternalTezServiceTestHelper.<init>(ExternalTezServiceTestHelper.java:73) > at > org.apache.tez.tests.TestExternalTezServices.setup(TestExternalTezServices.java:76) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
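Both failures quoted in this thread (the NoClassDefFoundError for ShuffleHandler and the NoSuchMethodError for the netty `headers()` call) are classpath problems, and a small reflection probe can help confirm which one a given test classpath has. The sketch below is an assumption-laden illustration: `ClasspathProbe` and its methods are hypothetical and not part of Tez or Hadoop. It uses only standard `java.lang` reflection calls.

```java
/**
 * Hypothetical diagnostic (not part of Tez or Hadoop): reports whether a
 * class, or a public no-argument method on it, is visible on the classpath.
 */
final class ClasspathProbe {

    /** True if the named class can be loaded, without initializing it. */
    static boolean hasClass(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** True if the named class exposes a public no-arg method with this name. */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            c.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack traces in this thread.
        System.out.println("ShuffleHandler present: "
            + hasClass("org.apache.hadoop.mapred.ShuffleHandler"));
        System.out.println("HttpResponse.headers() present: "
            + hasNoArgMethod("org.jboss.netty.handler.codec.http.HttpResponse", "headers"));
    }
}
```

Run with the same classpath as the failing test, a missing ShuffleHandler would point at the absent hadoop-mapreduce-client-shuffle dependency, while a missing `headers()` would point at an older netty jar being picked up.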
[jira] [Commented] (TEZ-3930) TestDagAwareYarnTaskScheduler fails on Hadoop 3
[ https://issues.apache.org/jira/browse/TEZ-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467459#comment-16467459 ] Jonathan Eagles commented on TEZ-3930: -- +1. The intermittent failure is unrelated (looking into it separately). Going to commit this to master and branch-0.9 > TestDagAwareYarnTaskScheduler fails on Hadoop 3 > --- > > Key: TEZ-3930 > URL: https://issues.apache.org/jira/browse/TEZ-3930 > Project: Apache Tez > Issue Type: Sub-task >Reporter: Jonathan Eagles >Assignee: Jason Lowe >Priority: Major > Attachments: TEZ-3930.001.patch > > > When scheduler shutdown is called, the AMRMClientAsyncImpl serviceStop is > invoked, which then interrupts the heartbeat thread and then proceeds to join > on the heartbeat thread. The heartbeat thread continues to run and continues > to throw NullPointerExceptions. The interrupt doesn't seem to cause the > thread to be interrupted now in Hadoop 3 (is YARN-5999 to blame, or Tez?) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
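The shutdown pattern described in TEZ-3930 (interrupt the heartbeat thread, then join on it) only terminates if the heartbeat loop actually honours the interrupt. The following is a generic sketch of such a loop, not the AMRMClientAsyncImpl code; `HeartbeatSketch`, `startHeartbeat` and `stop` are hypothetical names. It assumes the heartbeat callback may keep throwing runtime exceptions (such as a NullPointerException during shutdown) and shows the loop still exiting once interrupted.

```java
/**
 * Hypothetical sketch (not Hadoop or Tez code): a heartbeat loop that exits
 * on interrupt even when the heartbeat callback itself keeps failing.
 */
final class HeartbeatSketch {

    /** Starts a daemon thread that calls beat.run() every intervalMs until interrupted. */
    static Thread startHeartbeat(Runnable beat, long intervalMs) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    beat.run();
                } catch (RuntimeException e) {
                    // A failing heartbeat is tolerated here; it must not hide an interrupt.
                }
                try {
                    Thread.sleep(intervalMs);
                } catch (InterruptedException e) {
                    // Restore the flag so the loop condition sees it and exits.
                    Thread.currentThread().interrupt();
                }
            }
        }, "heartbeat-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Interrupts the thread and waits up to joinTimeoutMs; true if it stopped. */
    static boolean stop(Thread t, long joinTimeoutMs) {
        t.interrupt();
        try {
            t.join(joinTimeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```

The bounded join in `stop` is a design choice for the sketch: if the loop ever swallowed the interrupt, the caller would get `false` back instead of blocking forever.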
[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valencia Edna Serrao updated TEZ-3932: -- Attachment: org.apache.tez.test.TestExceptionPropagation-output.txt > estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > Attachments: org.apache.tez.test.TestExceptionPropagation-output.txt > > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valencia Edna Serrao updated TEZ-3932: -- Labels: ppc x86 (was: ) > estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valencia Edna Serrao updated TEZ-3932: -- Affects Version/s: 0.10.0 > estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug >Affects Versions: 0.10.0 > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > Labels: ppc, x86 > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TEZ-3932) estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc
[ https://issues.apache.org/jira/browse/TEZ-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467149#comment-16467149 ] Valencia Edna Serrao commented on TEZ-3932: --- Thanks [~jeagles] for taking a look at this issue. Here are the details you requested: # Tez version: 0.10.0-SNAPSHOT # Frequency: Quite consistent on ppc for the past few weeks, but it didn't occur when tried on commit 2e66f3cb2ef082889551f6a0830c7014317d9680. This week, however, I see it on both architectures. # Test failure logs: [^org.apache.tez.test.TestExceptionPropagation-output.txt] Please let me know if you have any questions. > estExceptionPropagation#testExceptionPropagationSession fails on x86 and ppc > > > Key: TEZ-3932 > URL: https://issues.apache.org/jira/browse/TEZ-3932 > Project: Apache Tez > Issue Type: Bug > Environment: arch: x86 and ppc > java: openjdk version "1.8.0_161" > OpenJDK Runtime Environment (build 1.8.0_161-b14) > OpenJDK 64-Bit Server VM (build 25.161-b14, mixed mode) >Reporter: Valencia Edna Serrao >Priority: Major > > Test > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession > on x86 and ppc. I found related JIRA's TEZ-3746 and TEZ-3748. Though the > issue is marked as resolved in the related JIRA's, the issue exists. Below > are the error details: > {code:java} > --- > Test set: org.apache.tez.test.TestExceptionPropagation > --- > Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 96.433 sec > <<< FAILURE! > testExceptionPropagationSession(org.apache.tez.test.TestExceptionPropagation) > Time elapsed: 52.7 sec <<< ERROR! > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1525667420557_0001, yarnApplicationState=FAILED, > finalApplicationStatus=FAILED, trackingUrl=N/A, diagnostics=[DAG completed > with an ERROR state. 
Shutting down AM, Session stats:submittedDAGs=11, > successfulDAGs=0, failedDAGs=12, killedDAGs=0] > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1024) > at org.apache.tez.client.TezClient.waitForProxy(TezClient.java:1034) > at > org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:652) > at org.apache.tez.client.TezClient.submitDAG(TezClient.java:588) > at > org.apache.tez.test.TestExceptionPropagation.testExceptionPropagationSession(TestExceptionPropagation.java:227 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)