[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-12-11 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053429#comment-15053429
 ] 

Siddharth Seth commented on TEZ-2798:
-

No NPEs during the regular tests; there was one while shutting down the 
service. This was a local run, since Jenkins is in a mess at the moment. 
Uploading a patch with the tests ignored and committing.

> NPE when executing TestMemoryWithEvents::testMemoryScatterGather
> 
>
> Key: TEZ-2798
> URL: https://issues.apache.org/jira/browse/TEZ-2798
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Siddharth Seth
> Priority: Blocker
> Attachments: TEZ-2798.1.ignoredTemporarilyActivated.txt
>
>
> {noformat}
> 2015-09-10 05:07:45,885 ERROR [Dispatcher thread: Central] common.AsyncDispatcher (AsyncDispatcher.java:dispatch(188)) - Error in dispatcher thread
> java.lang.NullPointerException
>   at org.apache.tez.dag.app.ContainerLauncherContextImpl.containerLaunched(ContainerLauncherContextImpl.java:47)
>   at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launch(MockDAGAppMaster.java:280)
>   at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launchContainer(MockDAGAppMaster.java:219)
>   at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:200)
>   at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:46)
>   at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>   at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This wasn't caught in Jenkins because these are very long-running tests and 
> are marked as @Ignore (mainly for internal testing).
> The same exception occurs with testMemoryBroadcast, testMemoryOneToOne, and 
> testMemoryRootInputEvents.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-12-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048537#comment-15048537
 ] 

TezQA commented on TEZ-2798:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12776521/TEZ-2798.1.ignoredTemporarilyActivated.txt
  against master revision 5af0604.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.client.TestTezClient

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1358//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1358//console

This message is automatically generated.



[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-12-09 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049214#comment-15049214
 ] 

Siddharth Seth commented on TEZ-2798:
-

[~bikassaha], [~rajesh.balamohan] - could you please review? The test failure 
is unrelated. I'll add the @Ignore annotations back before committing.



[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-12-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049467#comment-15049467
 ] 

Bikas Saha commented on TEZ-2798:
-

The tests need to be rerun, since it looks like the Jenkins run stopped after 
the first test failure. Once the MockDAGAppMaster tests have rerun (or in your 
local run), could you please double-check that the output logs from the tests 
don't have NPEs related to accessing external services?

The patch looks good to me. 

bq. Didn't quite understand this, or the significance w.r.t this jira. The 
state machines could always run independently
This was trying to explain why the other MockDAGAppMaster tests were passing 
even though, internally, the code was hitting NPEs in the container launcher 
flow. Since the container launcher was not in the execution state machine 
flow, the tests continued to pass despite the NPEs.
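
To make the point above concrete, here is a minimal sketch of how a test can keep passing while a side path fails. The names (Dispatcher, Handler, the "LAUNCH" and "TASK" event types, errorsSeen) are hypothetical stand-ins for illustration only and are not the real Tez classes; the dispatcher is synchronous and much simpler than AsyncDispatcher.

```java
// Illustrative only: hypothetical stand-ins, not the actual Tez classes.
class DispatcherSketch {
    interface Handler { void handle(String event); }

    // Simplified, synchronous stand-in for an event dispatcher.
    static class Dispatcher {
        private final java.util.Map<String, Handler> handlers = new java.util.HashMap<>();
        private final boolean exitOnError;
        int errorsSeen = 0;

        Dispatcher(boolean exitOnError) { this.exitOnError = exitOnError; }

        void register(String type, Handler handler) { handlers.put(type, handler); }

        void dispatch(String type, String event) {
            try {
                handlers.get(type).handle(event);
            } catch (RuntimeException e) {
                errorsSeen++;
                if (exitOnError) {
                    throw e; // a strict dispatcher would stop here
                }
                // Otherwise the error is only counted (or logged) and dispatch continues.
            }
        }
    }

    // The launcher handler fails, but the task flow never depends on it.
    static String runScenario() {
        Dispatcher dispatcher = new Dispatcher(false);
        String[] taskState = {"RUNNING"};
        dispatcher.register("LAUNCH", e -> { throw new NullPointerException("context is null"); });
        dispatcher.register("TASK", e -> taskState[0] = "SUCCEEDED");
        dispatcher.dispatch("LAUNCH", "container_1");
        dispatcher.dispatch("TASK", "task_1");
        return "task=" + taskState[0] + " errors=" + dispatcher.errorsSeen;
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // prints "task=SUCCEEDED errors=1"
    }
}
```

A test that only asserts on the task state would pass here, even though one error was recorded on the launcher path.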




[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-09-11 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741746#comment-14741746
 ] 

Bikas Saha commented on TEZ-2798:
-

I investigated this. 

The context passed to the ContainerLauncherContext is null because the 
MockDAGAppMaster constructor passes it on while its own context object is 
still null. Whenever the launcher context methods are invoked, they NPE on the 
context member. So when the mock AM launches the mock container, there is an 
NPE and the container stays in the launching state.
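
The failure mode described above can be sketched roughly as follows. All names here (AppContext, LauncherContext, MockAppMaster, init()) are hypothetical and chosen for illustration; this is not the MockDAGAppMaster code, only the general pattern of capturing a field before it has been assigned.

```java
// Illustrative only: hypothetical stand-ins, not the actual Tez classes.
interface AppContext {
    void containerLaunched(String containerId);
}

// Holds whatever reference it was given at construction time.
class LauncherContext {
    private final AppContext context;

    LauncherContext(AppContext context) {
        this.context = context; // a null reference is accepted silently here
    }

    void containerLaunched(String containerId) {
        context.containerLaunched(containerId); // throws NPE if context was null
    }
}

class MockAppMaster {
    private AppContext context; // assigned later, in init()
    final LauncherContext launcherContext;

    MockAppMaster() {
        // Bug pattern: 'context' is still null when it is handed over.
        this.launcherContext = new LauncherContext(context);
    }

    void init() {
        // Too late: launcherContext has already captured the null reference.
        this.context = id -> { };
    }
}

class NullContextSketch {
    // Returns true if invoking the launcher context throws an NPE.
    static boolean launchThrowsNpe() {
        MockAppMaster am = new MockAppMaster();
        am.init();
        try {
            am.launcherContext.containerLaunched("container_1");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on launch: " + launchThrowsNpe()); // prints "NPE on launch: true"
    }
}
```

One common way to avoid this pattern is to construct the dependent object only after the field has been assigned, or to reject null in the constructor (for example with java.util.Objects.requireNonNull) so the problem surfaces at construction time.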

TEZ-2045 reversed the flow of sending the task spec to the communicator. This 
has the side effect that the container lifecycle becomes disconnected from the 
task lifecycle. Even if the container is in the launching state, the rest of 
the task state machine can proceed, because there are no further interactions 
with the AMContainer object after that (in the no-error case).

After the task completes, the local scheduler releases the container and the 
AMContainer transitions from launching to stopped. It NPEs again when the 
stop() callback is called, but the rest of the AM code and tests pass.

The NPEs are not crashing the AM because the AsyncDispatcher exit-on-error 
setting is false. Actually, the NPE should not be reaching the AsyncDispatcher 
at all, because the ContainerLauncherManager should catch exceptions thrown by 
service plugins when invoking their methods. In this case, the 
ContainerLauncherManager should have caught the exception from the 
plugin.launchContainer() invocation. However, none of the plugin APIs are 
declared to throw an exception, so the framework code does not catch it and we 
end up ignoring errors. Creating a jira to track that.
{code}
java.lang.NullPointerException
  at org.apache.tez.dag.app.ContainerLauncherContextImpl.containerLaunched(ContainerLauncherContextImpl.java:47)
  at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launch(MockDAGAppMaster.java:280)
  at org.apache.tez.dag.app.MockDAGAppMaster$MockContainerLauncher.launchContainer(MockDAGAppMaster.java:219)
  at org.apache.tez.dag.app.launcher.ContainerLauncherManager.handle(ContainerLauncherManager.java:200)
{code}
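
As a rough illustration of the guard suggested above (the manager catching whatever a plugin throws instead of letting it reach the dispatcher), here is a minimal sketch. LauncherPlugin, LauncherManager, handleLaunch and reportedErrors are hypothetical names, and the actual fix may well look different; the error handling shown (recording a message) is only a placeholder for whatever the framework would do.

```java
// Illustrative only: hypothetical stand-ins, not the actual Tez classes.
class PluginGuardSketch {
    interface LauncherPlugin { void launchContainer(String containerId); }

    static class LauncherManager {
        private final LauncherPlugin plugin;
        final java.util.List<String> reportedErrors = new java.util.ArrayList<>();

        LauncherManager(LauncherPlugin plugin) { this.plugin = plugin; }

        void handleLaunch(String containerId) {
            try {
                plugin.launchContainer(containerId);
            } catch (RuntimeException e) {
                // Record the failure here instead of letting it escape to the dispatcher.
                reportedErrors.add(containerId + ": " + e.getClass().getSimpleName());
            }
        }
    }

    static java.util.List<String> demo() {
        // A plugin that fails the way the mock launcher did.
        LauncherManager manager = new LauncherManager(
            id -> { throw new NullPointerException("context is null"); });
        manager.handleLaunch("container_1");
        return manager.reportedErrors;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[container_1: NullPointerException]"
    }
}
```

With a guard like this the failure becomes an explicit, observable event on the launcher path rather than a log line in the dispatcher thread.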

> NPE when executing TestMemoryWithEvents::testMemoryScatterGather
> 
>
> Key: TEZ-2798
> URL: https://issues.apache.org/jira/browse/TEZ-2798
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Priority: Blocker
> Fix For: 0.8.1
>
>


[jira] [Commented] (TEZ-2798) NPE when executing TestMemoryWithEvents::testMemoryScatterGather

2015-09-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14740023#comment-14740023
 ] 

Bikas Saha commented on TEZ-2798:
-

Do you have a fix for this?
