[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053429#comment-15053429 ]

Siddharth Seth commented on TEZ-2798:

No NPEs during regular tests - one while shutting down the service. This is in a local run since jenkins is in a mess at the moment.
Uploading a patch with the tests ignored and committing.

> NPE when executing TestMemoryWithEvents::testMemoryScatterGather
>
> Key: TEZ-2798
> URL: https://issues.apache.org/jira/browse/TEZ-2798
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Siddharth Seth
> Priority: Blocker
> Attachments: TEZ-2798.1.ignoredTemporarilyActivated.txt
>
> {noformat}
> 2015-09-10 05:07:45,885 ERROR [Dispatcher thread: Central] common.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.tez.dag.app.ContainerLauncherContextImpl.containerLaunched(ContainerLauncherContextImpl.java:47)
>     at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launch(MockDAGAppMaster.java:280)
>     at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launchContainer(MockDAGAppMaster.java:219)
>     at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:200)
>     at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:46)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Wasn't caught in jenkins as these tests are very long running tests and are marked as @Ignore (mainly for internal testing).
> Same exception with testMemoryBroadcast, testMemoryOneToOne, testMemoryRootInputEvents

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048537#comment-15048537 ]

TezQA commented on TEZ-2798:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12776521/TEZ-2798.1.ignoredTemporarilyActivated.txt
  against master revision 5af0604.

  {color:green}+1 @author{color}. The patch does not contain any @author tags.
  {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
  {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
  {color:green}+1 javadoc{color}. There were no new javadoc warning messages.
  {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.
  {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
  {color:red}-1 core tests{color}. The patch failed these unit tests in :
    org.apache.tez.client.TestTezClient

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1358//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1358//console

This message is automatically generated.
[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049214#comment-15049214 ]

Siddharth Seth commented on TEZ-2798:

[~bikassaha], [~rajesh.balamohan] - could you please review.
The test failure is unrelated. I'll add back the Ignores before committing.
[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049467#comment-15049467 ]

Bikas Saha commented on TEZ-2798:

The tests need to be rerun, since it looks like the Jenkins run stopped after the first test failure. Once the MockDAGAppMaster tests have rerun (or in your local run), could you please double-check that the output logs from the tests don't contain NPEs related to accessing external services.

The patch looks good to me.

bq. Didn't quite understand this, or the significance w.r.t this jira. The state machines could always run independently

This was trying to explain why the other MockDAGAppMaster tests were passing even though, internally, the code was hitting NPEs in the container launcher flow. Since the container launcher was not in the execution state machine flow, the tests continued to pass despite the NPEs.
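The log check requested in the comment above could be automated roughly as follows. This is only a minimal sketch: `LogNpeCheckSketch` and `npeLines` are hypothetical names rather than anything in the Tez code base, and it assumes the test output has already been read into a list of lines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Tez: scans captured test output for NPEs.
class LogNpeCheckSketch {

  // Returns every log line that mentions a NullPointerException.
  static List<String> npeLines(List<String> logLines) {
    return logLines.stream()
        .filter(line -> line.contains("java.lang.NullPointerException"))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Sample lines modelled on the trace quoted in this issue.
    List<String> log = Arrays.asList(
        "ERROR [Dispatcher thread: Central] common.AsyncDispatcher - Error in dispatcher thread",
        "java.lang.NullPointerException",
        "at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)");
    // A test that passes functionally could still be flagged when this list is non-empty.
    System.out.println(npeLines(log));
  }
}
```

A check of this kind only complements the functional assertions. It is useful precisely because, as described above, the tests can pass while the container launcher flow is failing in the background.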
[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741746#comment-14741746 ]

Bikas Saha commented on TEZ-2798:

I investigated this. The context passed to ContainerLauncherContextImpl is null because it is incorrectly passed when the context object is null in the MockDAGAppMaster constructor. Whenever the launcher context methods are invoked, they NPE on its context member. So when the mock AM launches the mock container, there is an NPE and the container stays in the launching state.

TEZ-2045 reversed the flow of sending the task spec to the communicator. This has the side effect that the container lifecycle becomes disconnected from the task lifecycle. Even if the container is in the launching state, the rest of the task state machine can proceed, because there are no further interactions with the AMContainer object after that (in the no-error case). After the task completes, the local scheduler releases the container and the AMContainer transitions from launching to stopped. Again it NPEs when the stop() callback is called, but the rest of the AM code and tests pass.

The NPEs are not crashing the AM because the AsyncDispatcher's exit-on-error setting is false.

Actually, the NPE should not be reaching the AsyncDispatcher at all, because the ContainerLauncherManager should catch exceptions thrown from service plugins when invoking their methods. In this case, the ContainerLauncherManager should have caught the exception in the plugin.launchContainer() invocation. However, none of the plugin APIs actually declare an exception, so the framework code does not catch it and we end up ignoring errors. Creating a jira to track that.
{code}
java.lang.NullPointerException
    at org.apache.tez.dag.app.ContainerLauncherContextImpl.containerLaunched(ContainerLauncherContextImpl.java:47)
    at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launch(MockDAGAppMaster.java:280)
    at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launchContainer(MockDAGAppMaster.java:219)
    at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:200)
{code}

> NPE when executing TestMemoryWithEvents::testMemoryScatterGather
>
> Key: TEZ-2798
> URL: https://issues.apache.org/jira/browse/TEZ-2798
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Priority: Blocker
> Fix For: 0.8.1
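The failure mode described in this comment can be illustrated with a small stand-alone sketch. None of the names below are Tez classes: `AppContext`, `LauncherContext` and `Dispatcher` are hypothetical stand-ins, and the `exitOnError` flag only loosely mirrors the dispatcher setting mentioned above. The sketch shows how a null reference stored at construction time fails only on first use, and how a dispatcher that logs handler errors lets later events proceed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the Tez classes.
class NullContextSketch {

  interface AppContext {
    void containerLaunched(String containerId);
  }

  // Stores whatever it is given, so a null passed to the constructor
  // goes unnoticed until a callback dereferences it.
  static class LauncherContext {
    private final AppContext context;

    LauncherContext(AppContext context) {
      this.context = context;
    }

    void containerLaunched(String containerId) {
      context.containerLaunched(containerId); // NPE here if context is null
    }
  }

  // Runs handlers in order; records failures and, unless exitOnError is set,
  // keeps dispatching subsequent events.
  static class Dispatcher {
    final boolean exitOnError;
    final List<String> errors = new ArrayList<>();
    boolean stopped = false;

    Dispatcher(boolean exitOnError) {
      this.exitOnError = exitOnError;
    }

    void dispatch(Runnable handler) {
      if (stopped) {
        return;
      }
      try {
        handler.run();
      } catch (RuntimeException e) {
        errors.add(e.getClass().getSimpleName());
        if (exitOnError) {
          stopped = true;
        }
      }
    }
  }

  // Returns how many task events completed after the failed container launch.
  static int run(boolean exitOnError) {
    LauncherContext launcherContext = new LauncherContext(null); // the bug
    Dispatcher dispatcher = new Dispatcher(exitOnError);
    int[] completedTaskEvents = {0};
    dispatcher.dispatch(() -> launcherContext.containerLaunched("container_1"));
    dispatcher.dispatch(() -> completedTaskEvents[0]++); // unrelated task event
    return completedTaskEvents[0];
  }

  public static void main(String[] args) {
    System.out.println(run(false)); // error is swallowed, task event still runs
    System.out.println(run(true));  // dispatcher stops after the first error
  }
}
```

With `exitOnError` set to false the task event still completes, which matches the observation that the tests pass despite the NPE. Failing fast in the constructor (for example with `java.util.Objects.requireNonNull`) would surface the problem at the point where the null is passed in.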
[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather
[ https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740023#comment-14740023 ]

Bikas Saha commented on TEZ-2798:

Do you have a fix for this?