[ https://issues.apache.org/jira/browse/TEZ-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243479#comment-14243479 ]
Jeff Zhang commented on TEZ-1273:
---------------------------------

bq. Are you saying that TezClient::stop was invoked when a DAG was running and the AM does not go into a KILLED state? That is a bug. However, if DAGClient::tryKillDAG was killed, then only the DAG should be killed with the session remaining in a running state.

Yes, the AM does not go into the KILLED state; it goes to SUCCEEDED instead. I changed it to go to KILLED in the patch.

bq. AM is not shutdown when TezClient::stop is invoked in non-session mode

From the comment, it looks like this is by design. [~bikassaha], please help confirm.
{code}
  /**
   * Stop the client. This terminates the connection to the YARN cluster.
   * In session mode, this shuts down the session DAG App Master
   * @throws TezException
   * @throws IOException
   */
  public synchronized void stop() throws TezException, IOException {
{code}

bq. Why is system.exit invoked in local mode? This should never happen.

System.exit is invoked in LocalClient when it fails to start the DAGAppMaster:
{code}
          dagAppMaster = createDAGAppMaster(applicationAttemptId, cId, currentHost, nmPort, nmHttpPort,
              new SystemClock(), appSubmitTime, isSession, userDir.toUri().getPath());
          clientHandler = new DAGClientHandler(dagAppMaster);
          DAGAppMaster.initAndStartAppMaster(dagAppMaster, currentUser.getShortUserName());
        } catch (Throwable t) {
          LOG.fatal("Error starting DAGAppMaster", t);
          System.exit(1);
        }
{code}

> Refactor DAGAppMaster to state machine based
> --------------------------------------------
>
>                 Key: TEZ-1273
>                 URL: https://issues.apache.org/jira/browse/TEZ-1273
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.4.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: DAGAppMaster_3.pdf, TEZ-1273-3.patch, TEZ-1273-4.patch, Tez-1273-2.patch, Tez-1273.patch, dag_app_master.pdf, dag_app_master2.pdf
>
>
> Almost all our entities (Vertex, Task etc) are state machine based and written using a formal state machine.
> But DAGAppMaster is not written on a formal state machine even though it has state machine based behavior. This JIRA is for refactoring it to be state machine based.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)