[ https://issues.apache.org/jira/browse/TEZ-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122733#comment-15122733 ]
Jeff Zhang commented on TEZ-2307:
---------------------------------
bq. I think it'll be better to move the DAGAppMaster into IDLE state only
after the cleanup is done.
I thought about that, but it would confuse users: the last DAG has completed,
yet they still cannot submit another DAG because the AM is still in the
RUNNING state. For now it seems DAG cleanup won't take long. Have you considered
putting it in DAGImpl.finish?
> Possible wrong error message when submitting new dag
> ----------------------------------------------------
>
> Key: TEZ-2307
> URL: https://issues.apache.org/jira/browse/TEZ-2307
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2307-1.patch, TEZ-2307-2.patch, TEZ-2307-3.patch,
> TEZ-2307-4.patch
>
>
> In the following two cases, the AM propagates the wrong error message to the client
> ("App master already running a DAG"):
> * The last DAG has completed but the AM is still in the RUNNING state.
> * The AM is shutting down.
> {code}
> 2015-04-10 06:01:50,369 INFO [IPC Server handler 0 on 46821] ipc.Server (Server.java:run(2070)) - IPC Server handler 0 on 46821, call org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.submitDAG from 10.0.0.223:48581 Call#411 Retry#0
> org.apache.tez.dag.api.TezException: App master already running a DAG
> at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1131)
> at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163)
> at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}
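
For illustration, the two cases in the description could be told apart at submit time roughly as follows. This is only a sketch under stated assumptions: the class name, the AmState enum, the currentDagCompleted flag, and the two new messages are hypothetical and are not taken from DAGAppMaster or from the attached patches. Only the "App master already running a DAG" message comes from the log above.

```java
// Hypothetical sketch; none of these names are the actual Tez implementation.
class SubmitCheckSketch {
    enum AmState { IDLE, RUNNING, SHUTTING_DOWN }

    private AmState state = AmState.IDLE;
    // Assumed flag: set when the current DAG finishes, before cleanup is done.
    private boolean currentDagCompleted = false;

    // Accepts a DAG when idle; otherwise rejects with a state-specific message.
    synchronized void submit(String dagName) {
        switch (state) {
            case IDLE:
                state = AmState.RUNNING;
                currentDagCompleted = false;
                return;
            case SHUTTING_DOWN:
                throw new IllegalStateException(
                    "App master is shutting down, cannot accept " + dagName);
            case RUNNING:
            default:
                if (currentDagCompleted) {
                    throw new IllegalStateException(
                        "Last DAG completed but app master is still cleaning up, retry "
                            + dagName);
                }
                throw new IllegalStateException("App master already running a DAG");
        }
    }

    synchronized void onDagCompleted() { currentDagCompleted = true; }

    // Moving to IDLE only after cleanup mirrors the ordering discussed above.
    synchronized void onCleanupDone() {
        if (state == AmState.RUNNING) {
            state = AmState.IDLE;
        }
    }

    synchronized void shutdown() { state = AmState.SHUTTING_DOWN; }

    public static void main(String[] args) {
        SubmitCheckSketch am = new SubmitCheckSketch();
        am.submit("dag1");
        am.onDagCompleted();
        try {
            am.submit("dag2"); // rejected: cleanup has not finished yet
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        am.onCleanupDone();
        am.submit("dag2"); // accepted now that the AM is idle again
    }
}
```

With a check like this, a client that submits during the cleanup window gets a message telling it to retry, not one implying that another DAG is still running.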