[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127760#comment-15127760 ]

TezQA commented on TEZ-2307:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12785703/TEZ-2307-7.patch
  against master revision 72f5616.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in :
        org.apache.tez.dag.app.TestSpeculation

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1448//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1448//console

This message is automatically generated.

> Possible wrong error message when submitting new dag
>
>                 Key: TEZ-2307
>                 URL: https://issues.apache.org/jira/browse/TEZ-2307
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2307-1.patch, TEZ-2307-2.patch, TEZ-2307-3.patch, TEZ-2307-4.patch, TEZ-2307-5.patch, TEZ-2307-6.patch, TEZ-2307-7.patch
>
> In the following 2 cases, the AM would propagate the wrong error message to the client ("App master already running a DAG"):
> * The last DAG is completed but the AM is still in RUNNING state
> * The AM is shutting down.
> {code}
> 2015-04-10 06:01:50,369 INFO [IPC Server handler 0 on 46821] ipc.Server (Server.java:run(2070)) - IPC Server handler 0 on 46821, call org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.submitDAG from 10.0.0.223:48581 Call#411 Retry#0
> org.apache.tez.dag.api.TezException: App master already running a DAG
>     at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1131)
>     at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
>     at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163)
>     at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
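The two cases listed in the issue description can be illustrated with a small Java sketch. It is only an illustration of how a submission check could choose a message per case; the names `SketchAmState`, `SubmitRejection`, `reasonFor` and the `lastDagCompleted` flag are hypothetical, as are the two new message strings, and none of this is taken from the actual patch.

```java
// Hypothetical sketch: none of these names come from the Tez code base.
enum SketchAmState { IDLE, RUNNING, SHUTTING_DOWN }

final class SubmitRejection {
    private SubmitRejection() {}

    /**
     * Returns null when a new DAG can be accepted, otherwise a message
     * that describes why the submission is rejected.
     */
    static String reasonFor(SketchAmState state, boolean lastDagCompleted) {
        switch (state) {
            case IDLE:
                return null;
            case SHUTTING_DOWN:
                // Case 2 from the issue: the AM is shutting down.
                return "App master is shutting down";
            case RUNNING:
                // Case 1 from the issue: the last DAG is done but the AM is not idle yet.
                return lastDagCompleted
                    ? "Previous DAG has completed but the app master is not idle yet"
                    : "App master already running a DAG";
            default:
                throw new IllegalStateException("Unhandled state: " + state);
        }
    }
}
```

With a check like this, "App master already running a DAG" is only reported when a DAG really is still running.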
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127417#comment-15127417 ]

Jeff Zhang commented on TEZ-2307:

Thanks [~sseth]. Uploaded a new patch to address the comments; will commit after the pre-commit build.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127026#comment-15127026 ]

Siddharth Seth commented on TEZ-2307:

+1. Looks good. Minor stuff before the commit:
- The findbugs-exclude entry is no longer required.
- Please make dagCleanupLock final. It could also be renamed to idleStateLock.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126295#comment-15126295 ]

Jeff Zhang commented on TEZ-2307:

Uploaded a new patch. [~sseth], please help review. The failed test should be unrelated.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125830#comment-15125830 ]

TezQA commented on TEZ-2307:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12785457/TEZ-2307-5.patch
  against master revision 870972d.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in :
        org.apache.tez.dag.app.dag.impl.TestDAGImpl
        org.apache.tez.dag.app.TestMockDAGAppMaster

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1442//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1442//console

This message is automatically generated.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123898#comment-15123898 ]

Siddharth Seth commented on TEZ-2307:

bq. I thought about that. but it would make user confused that the last dag is completed but he still can not submit another dag due to AM is still in RUNNING.
I thought this is what this jira is fixing? Run the new DAG after the previous one is complete, taking into account errors from the new DAG and cleanup of the old DAG.

bq. For now it seems dag clean up won't take too much, have you thought to put it in DAGImpl.finish ?
Cleanup sends messages to user plugins. Calling it within finish would mean that a DAG status lookup from the plugins would get the state as RUNNING instead of the actual final state.

DAG_CLEANUP was added as a new state in the DAGAppMaster state machine to allow any events still pending in the queue after "DAGAppMasterEventDAGFinished" to get processed. If you think there are no other events there, the DAG_CLEANUP state can be collapsed into DAG_FINISHED, in which case DAGAppMasterState.IDLE will be reached after cleanup. Otherwise, I think it's better to move the transition to the IDLE state into the DAG_CLEANUP handling. In either case, notify after the state is IDLE, so that the new submission can proceed after the old DAG is cleaned up.
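The ordering suggested in the comment above (stay RUNNING on the finished event, move to IDLE only while handling cleanup, then notify) could look roughly like the following Java sketch. It is a simplified stand-in and not the real DAGAppMaster state machine; the class `AmLifecycleSketch`, its nested enums and the `trace` list are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: simplified stand-ins, not the real DAGAppMaster state machine.
final class AmLifecycleSketch {
    enum State { RUNNING, IDLE }
    enum Event { DAG_FINISHED, DAG_CLEANUP }

    private final Object idleStateLock = new Object();
    private State state = State.RUNNING;                   // guarded by idleStateLock
    private final List<String> trace = new ArrayList<>();  // guarded by idleStateLock

    void handle(Event event) {
        synchronized (idleStateLock) {
            switch (event) {
                case DAG_FINISHED:
                    // The DAG has its final status, but the AM stays RUNNING so that
                    // events still queued behind this one can be processed.
                    trace.add("DAG_FINISHED -> " + state);
                    break;
                case DAG_CLEANUP:
                    // Cleanup of the old DAG would run here, before the AM becomes IDLE.
                    state = State.IDLE;
                    trace.add("DAG_CLEANUP -> " + state);
                    // Notify only after the state is IDLE so waiting submitters can proceed.
                    idleStateLock.notifyAll();
                    break;
            }
        }
    }

    State state() {
        synchronized (idleStateLock) {
            return state;
        }
    }

    List<String> trace() {
        synchronized (idleStateLock) {
            return new ArrayList<>(trace);
        }
    }
}
```

In this sketch a submitter that waits on `idleStateLock` can only wake up after the old DAG's cleanup has been handled.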
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122733#comment-15122733 ]

Jeff Zhang commented on TEZ-2307:

bq. I think it'll be better to move the DAGAppMaster into IDLE state only after the cleanup is done.
I thought about that, but it would confuse users: the last DAG is completed, yet they still cannot submit another DAG because the AM is still in RUNNING. For now it seems DAG cleanup won't take too much time; have you thought about putting it in DAGImpl.finish?
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122063#comment-15122063 ]

Siddharth Seth commented on TEZ-2307:

bq. I think make the submit RPC call wait might not be a good option because it is confused that user can not submit new dag even after previous dag is completed. So I suggest that user can still submit new dag, but keep the dag in NEW state until the cleanup of previous dag is done.
This is an option. A couple of things will need to be considered, though. The user will consider submitDag as successful. What happens if there's an error during the cleanup of the previous DAG? That would have to be sent back as part of DAG status monitoring. This can get fairly confusing for users: the DAG is accepted, but then they are notified about a failure due to a cleanup error from the previous DAG.

On the patch itself: instead of using a field (dagCleanupDone), I think it'll be better to move the DAGAppMaster into IDLE state only after the cleanup is done. My bad here, I should have fixed this in the patch which added the cleanup state. submitDag can wait on the DAG entering IDLE state instead of waiting on dagCleanup. A notification can be sent out once the DAG enters cleanup state. This also gets rid of the call from DAGImpl to set the dagCleanupedFlag to false.
- In the current patch, calling setDagCleanupDone races with handling of the DAGCleanupEvent if concurrent dispatchers are used. It'd be better to avoid this for when we support concurrent dispatchers as the default.
- A boolean field (maybe volatile) is sufficient instead of an AtomicBoolean since we're synchronizing on it.
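As a rough illustration of the suggestion above (wait for the idle state under a lock, with a plain boolean rather than an AtomicBoolean), here is a minimal Java sketch. The class `IdleStateGate` and its methods are hypothetical and not part of Tez; the timeout parameter is likewise an assumption, added because a timeout was floated elsewhere in this thread.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: IdleStateGate is an illustrative name, not a Tez class.
final class IdleStateGate {
    private final Object idleStateLock = new Object();
    // A plain boolean is enough because every access happens while holding idleStateLock.
    private boolean idle = true;

    /** Called when a DAG starts running. */
    void markBusy() {
        synchronized (idleStateLock) {
            idle = false;
        }
    }

    /** Called once cleanup of the previous DAG is done; wakes up waiting submitters. */
    void markIdle() {
        synchronized (idleStateLock) {
            idle = true;
            idleStateLock.notifyAll();
        }
    }

    /** Waits until the gate is idle; returns false if the timeout elapses first. */
    boolean awaitIdle(long timeoutMillis) throws InterruptedException {
        final long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (idleStateLock) {
            // Loop to guard against spurious wakeups.
            while (!idle) {
                long remainingNanos = deadlineNanos - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                // Round up to 1 ms so that wait(0), which waits indefinitely, is never called.
                idleStateLock.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
            }
            return true;
        }
    }
}
```

A submission handler could call `awaitIdle` before accepting a new DAG and report a timeout-specific error when it returns false, rather than the "already running a DAG" message.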
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121595#comment-15121595 ]

TezQA commented on TEZ-2307:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12784914/TEZ-2307-4.patch
  against master revision 2bf27de.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1438//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1438//console

This message is automatically generated.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121370#comment-15121370 ]

TezQA commented on TEZ-2307:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12784898/TEZ-2307-3.patch
  against master revision 2bf27de.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1437//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1437//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1437//console

This message is automatically generated.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121148#comment-15121148 ] TezQA commented on TEZ-2307: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12784857/TEZ-2307-2.patch against master revision 2bf27de. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1436//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/1436//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1436//console This message is automatically generated. > Possible wrong error message when submitting new dag > > > Key: TEZ-2307 > URL: https://issues.apache.org/jira/browse/TEZ-2307 > Project: Apache Tez > Issue Type: Bug >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Attachments: TEZ-2307-1.patch, TEZ-2307-2.patch > > > In the following 2 cases, AM would propagate wrong error message to client > ("App master already running a DAG") > * The last dag is completed but AM is still in RUNNING state > * AM is in shutting down. 
> {code}
> 2015-04-10 06:01:50,369 INFO [IPC Server handler 0 on 46821] ipc.Server (Server.java:run(2070)) - IPC Server handler 0 on 46821, call org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.submitDAG from 10.0.0.223:48581 Call#411 Retry#0
> org.apache.tez.dag.api.TezException: App master already running a DAG
> at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1131)
> at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120974#comment-15120974 ]
Jeff Zhang commented on TEZ-2307:

Attached a new patch. [~sseth] Please help review.
* This patch has one drawback: the DAG submission RPC blocks if cleanup of the previous DAG is not done. I expect DAG cleanup not to take long (we can add a timeout if necessary).
* DAGImpl.finish needs to set the dagCleanupDone flag; otherwise the next DAG submission cannot tell whether cleanup of the previous DAG is done.

> Possible wrong error message when submitting new dag
>
> Key: TEZ-2307
> URL: https://issues.apache.org/jira/browse/TEZ-2307
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2307-1.patch, TEZ-2307-2.patch
>
> In the following 2 cases, AM would propagate wrong error message to client
> ("App master already running a DAG")
> * The last dag is completed but AM is still in RUNNING state
> * AM is in shutting down.
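The blocking behaviour described in this comment could be sketched roughly as follows. This is a minimal illustration, not the code in TEZ-2307-2.patch: the class `DagSubmissionGate` and its methods are hypothetical names, and only the `dagCleanupDone` flag and the optional timeout come from the comment itself.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the AM-side submission path; not actual Tez code.
final class DagSubmissionGate {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition cleanupSignal = lock.newCondition();
  // True when no previous DAG is still waiting for cleanup.
  private boolean dagCleanupDone = true;

  /** Called when a DAG is accepted: its cleanup is now outstanding. */
  void onDagAccepted() {
    lock.lock();
    try {
      dagCleanupDone = false;
    } finally {
      lock.unlock();
    }
  }

  /** Called once cleanup of the finished DAG has completed. */
  void onDagCleanupDone() {
    lock.lock();
    try {
      dagCleanupDone = true;
      cleanupSignal.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /**
   * Blocks the submitting thread until the previous DAG's cleanup is done.
   * Returns false if the timeout elapses or the thread is interrupted first.
   */
  boolean awaitPreviousDagCleanup(long timeout, TimeUnit unit) {
    long nanosLeft = unit.toNanos(timeout);
    lock.lock();
    try {
      while (!dagCleanupDone) {
        if (nanosLeft <= 0L) {
          return false; // timed out; the caller can report a precise error
        }
        nanosLeft = cleanupSignal.awaitNanos(nanosLeft);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    } finally {
      lock.unlock();
    }
  }
}
```

With a gate like this, a submission handler would call `awaitPreviousDagCleanup` before accepting the new DAG, and could return a message such as "previous DAG is still being cleaned up" on timeout rather than "App master already running a DAG".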
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118448#comment-15118448 ]
Jeff Zhang commented on TEZ-2307:

I think making the submit RPC call wait might not be a good option, because it is confusing that a user cannot submit a new DAG even after the previous DAG has completed. So I suggest that the user can still submit a new DAG, but the DAG is kept in NEW state until cleanup of the previous DAG is done. [~sseth] What do you think?

> Possible wrong error message when submitting new dag
>
> Key: TEZ-2307
> URL: https://issues.apache.org/jira/browse/TEZ-2307
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2307-1.patch
>
> In the following 2 cases, AM would propagate wrong error message to client
> ("App master already running a DAG")
> * The last dag is completed but AM is still in RUNNING state
> * AM is in shutting down.
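The alternative suggested in this comment (accept the DAG immediately, but hold it in NEW state until the previous DAG's cleanup is done) could look roughly like the sketch below. All names here (`DeferredDagStarter`, `DagState`, the callback methods) are hypothetical and chosen for illustration only; the real AM state machine is considerably more involved.

```java
// Hypothetical model of "accept now, start after cleanup"; not actual Tez code.
final class DeferredDagStarter {
  enum DagState { NEW, RUNNING }

  private String runningDag;      // DAG currently executing, or null
  private String heldDag;         // DAG accepted but held in NEW state, or null
  private boolean cleanupPending; // previous DAG finished, cleanup not yet done

  /** Accepts a DAG; it starts at once unless a cleanup is still pending. */
  synchronized DagState submit(String dagName) {
    if (runningDag != null || heldDag != null) {
      // Only in this case is a DAG genuinely occupying the AM.
      throw new IllegalStateException("App master already running a DAG");
    }
    if (cleanupPending) {
      heldDag = dagName; // accepted, but not started yet
      return DagState.NEW;
    }
    runningDag = dagName;
    return DagState.RUNNING;
  }

  /** The running DAG has completed; its cleanup is now outstanding. */
  synchronized void onDagFinished() {
    runningDag = null;
    cleanupPending = true;
  }

  /** Cleanup is done; starts the held DAG, if any, and returns its name. */
  synchronized String onCleanupDone() {
    cleanupPending = false;
    runningDag = heldDag;
    heldDag = null;
    return runningDag;
  }
}
```

In this model the submit call never blocks, and the "already running" error is reserved for the case where a DAG is actually running or already held.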
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090008#comment-15090008 ]
Siddharth Seth commented on TEZ-2307:

[~zjffdu] - not sure whether this patch is ready for review yet.

I was looking at the code, and I think there's another problem around the way the state transitions happen. If a DAG is accepted before the AM has completed all its state transitions, there's a possibility that the DAG_FINISHED event and the subsequent DAG_CLEANUP have not been processed. If DAG_CLEANUP is processed after a new DAG is submitted, we may see additional errors with that DAG, since cleanup notifies components about the previous DAG finishing and also empties the ID caches. This could result in all kinds of strange errors with the newly submitted DAG.

There's a small chance that synchronization is taking care of this, but I have my doubts, since 'submitDAG' holds the lock on the AppMaster. So just allowing a new DAG to be submitted may guarantee out-of-order execution of the previous DAG's cleanup, instead of throwing the exception that it throws today.

I think we need to make the new DAG submission wait till the previous DAG has been cleaned up.

> Possible wrong error message when submitting new dag
>
> Key: TEZ-2307
> URL: https://issues.apache.org/jira/browse/TEZ-2307
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2307-1.patch
>
> In the following 2 cases, AM would propagate wrong error message to client
> ("App master already running a DAG")
> * The last dag is completed but AM is still in RUNNING state
> * AM is in shutting down.
[jira] [Commented] (TEZ-2307) Possible wrong error message when submitting new dag
[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089028#comment-15089028 ] TezQA commented on TEZ-2307: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12781162/TEZ-2307-1.patch against master revision d5c9649. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in : org.apache.tez.dag.history.ats.acls.TestATSHistoryWithACLs The following test timeouts occurred in : org.apache.tez.test.TestRecovery Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1409//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1409//console This message is automatically generated. > Possible wrong error message when submitting new dag > > > Key: TEZ-2307 > URL: https://issues.apache.org/jira/browse/TEZ-2307 > Project: Apache Tez > Issue Type: Bug >Reporter: Jeff Zhang >Assignee: Jeff Zhang > Fix For: 0.7.1, 0.8.2 > > Attachments: TEZ-2307-1.patch > > > In the following 2 cases, AM would propagate wrong error message to client > ("App master already running a DAG") > * The last dag is completed but AM is still in RUNNING state > * AM is in shutting down. 