[ 
https://issues.apache.org/jira/browse/TEZ-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065514#comment-14065514
 ] 

Jeff Zhang commented on TEZ-1273:
---------------------------------

Updated the state machine and attached the patch.

Changes to the state machine:
* Separate session and non-session modes into different states.
     Session states: SESSION_IDLE, SESSION_RUNNING
     Non-session states: DAG_RUNNING, DAG_SUCCEED, DAG_KILLED, DAG_FAILED
* Add an intermediate state for termination: TERMINATING. Termination is split 
into 2 transitions: START_TERMINATE and FINAL_TERMINATE.
** START_TERMINATE checks whether a DAG is running. If one is, it sends DAG_KILL 
first; otherwise, it goes straight to the FINAL_TERMINATE transition. This 
stage also decides whether cleanup should be done.
    Cleanup case: kill from the client side
    No-cleanup case: INTERNAL_ERROR, AM_REBOOT
    (Cleanup is not done in the shutdown hook.)

** FINAL_TERMINATE does the cleanup if necessary (left as a placeholder for 
now; another ticket tracks this) and stops all the services.
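
To make the termination flow described above easier to review, here is a minimal standalone sketch in plain Java. It is not the code in the patch and does not use the Hadoop/Tez state machine classes. The class name AppMasterSketch, the Cause enum, the TERMINATED end state, the onDagFinished() callback, and the string "actions" log are all hypothetical names added for illustration. It also assumes that TERMINATING waits for the killed DAG to finish before FINAL_TERMINATE runs, which is one reading of the description.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the DAGAppMaster implementation.
class AppMasterSketch {
    enum State {
        SESSION_IDLE, SESSION_RUNNING,                     // session states
        DAG_RUNNING, DAG_SUCCEED, DAG_KILLED, DAG_FAILED,  // non-session states
        TERMINATING,                                       // intermediate state
        TERMINATED                                         // assumed end state (hypothetical)
    }

    // Hypothetical reasons for termination, taken from the cases listed above.
    enum Cause { CLIENT_KILL, INTERNAL_ERROR, AM_REBOOT }

    State state;
    boolean cleanupOnExit;
    final List<String> actions = new ArrayList<>();  // records what would be done

    AppMasterSketch(State initial) {
        this.state = initial;
    }

    boolean isDagRunning() {
        return state == State.SESSION_RUNNING || state == State.DAG_RUNNING;
    }

    // START_TERMINATE: decide on cleanup, then kill a running DAG or finish now.
    void startTerminate(Cause cause) {
        cleanupOnExit = (cause == Cause.CLIENT_KILL);
        if (isDagRunning()) {
            actions.add("DAG_KILL");
            state = State.TERMINATING;
        } else {
            finalTerminate();
        }
    }

    // Assumed callback: the killed DAG has reported completion.
    void onDagFinished() {
        if (state == State.TERMINATING) {
            finalTerminate();
        }
    }

    // FINAL_TERMINATE: optional cleanup (placeholder), then stop all services.
    void finalTerminate() {
        if (cleanupOnExit) {
            actions.add("CLEANUP");
        }
        actions.add("STOP_SERVICES");
        state = State.TERMINATED;
    }

    public static void main(String[] args) {
        AppMasterSketch am = new AppMasterSketch(State.SESSION_RUNNING);
        am.startTerminate(Cause.CLIENT_KILL);
        am.onDagFinished();
        System.out.println(am.state + " " + am.actions);
    }
}
```

In this sketch the cleanup decision is made once in START_TERMINATE and only consulted in FINAL_TERMINATE, so an INTERNAL_ERROR or AM_REBOOT never triggers cleanup regardless of which path reaches the final transition.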


Ran MRRSleep successfully on a local cluster in session mode, non-session 
mode, and with recovery.

[~hitesh] Please help review it. I will add unit tests later.



 


> Refactor DAGAppMaster to state machine based
> --------------------------------------------
>
>                 Key: TEZ-1273
>                 URL: https://issues.apache.org/jira/browse/TEZ-1273
>             Project: Apache Tez
>          Issue Type: Improvement
>    Affects Versions: 0.4.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: Tez-1273.patch, dag_app_master.pdf, dag_app_master2.pdf
>
>
> Almost all our entities (Vertex, Task, etc.) are state machine based and 
> written using a formal state machine. DAGAppMaster, however, is not written 
> on a formal state machine even though its behavior is state machine based. 
> This JIRA is for refactoring it to be state machine based.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
