[ https://issues.apache.org/jira/browse/TEZ-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated TEZ-2589:
----------------------------
    Description: 
VertexManager is part of Vertex and is a user-facing API. Task recovery 
depends not only on Vertex but also on VertexManager.  Currently 
VertexManager may interact with Vertex throughout the whole lifecycle of 
Vertex, which makes the recovery of Vertex/Task pretty complicated.  Recovering 
VertexManager itself is almost impossible: because it is a user-facing API, we 
have no control over it. 
Defining the completeness of VertexManager could help the recovery of Vertex. 
VertexManager is complete when it has fulfilled its responsibility, so it won't 
interact with Vertex and won't be used by Vertex again. If VertexManager is in 
the completed state, we don't need it in recovery. 

The following are the methods through which VertexManager interacts with 
Vertex via VertexManagerPluginContext. We can classify these methods into 2 
types: one reconfigures the vertex (changing parallelism, the source edge 
manager, etc.), and the other schedules tasks.  If VertexManager is in the 
completed state, these methods won't be called again.
* setVertexParallelism
* reconfigureVertex
* vertexReconfigurationPlanned
* doneReconfiguringVertex
* scheduleVertexTasks



Initial idea to represent the completeness of VertexManager: 
* If VertexImpl#vertexReconfigurationPlanned is not invoked, there is 1 
condition for the completeness of VertexManager
** All the tasks are started (all TaskStartedEvents are seen; otherwise we 
can't guarantee that VertexManager will schedule tasks the same way as in the 
last AM attempt). That means VertexManager won't call scheduleTasks again. 
* If VertexImpl#vertexReconfigurationPlanned is invoked, there are 2 conditions 
for the completeness of VertexManager
** VertexImpl#doneReconfiguringVertex is invoked
** All the tasks are started (all TaskStartedEvents are seen; otherwise we 
can't guarantee that VertexManager will schedule tasks the same way as in the 
last AM attempt). That means VertexManager won't call scheduleTasks again. 

If VertexManager is in the completed state, we can continue the recovery of the 
vertex based on the recovery events. Otherwise, recover the vertex from scratch. 
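
The conditions above could be sketched roughly as follows. This is only an 
illustration of the proposed rule, not Tez code: the class 
VertexManagerCompleteness, its callbacks, and the two recovery-mode strings are 
hypothetical names, and it assumes the AM can replay from the recovery log 
whether vertexReconfigurationPlanned and doneReconfiguringVertex were invoked 
and which TaskStartedEvents were seen.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical tracker (not part of Tez): records the signals named in the
 * description and derives whether VertexManager can be treated as completed.
 */
final class VertexManagerCompleteness {
  // Assumed to be the vertex's current parallelism (number of tasks).
  private int totalTasks;
  // Indexes of tasks for which a TaskStartedEvent has been seen.
  private final Set<Integer> startedTasks = new HashSet<>();
  private boolean reconfigurationPlanned;
  private boolean reconfigurationDone;

  VertexManagerCompleteness(int totalTasks) {
    this.totalTasks = totalTasks;
  }

  // Assumed hook: call when reconfiguration changes the parallelism.
  void onParallelismChanged(int newTotalTasks) { totalTasks = newTotalTasks; }

  void onVertexReconfigurationPlanned() { reconfigurationPlanned = true; }

  void onDoneReconfiguringVertex() { reconfigurationDone = true; }

  void onTaskStarted(int taskIndex) { startedTasks.add(taskIndex); }

  /** True when every condition listed for the applicable case holds. */
  boolean isCompleted() {
    // If reconfiguration was never planned, only the task condition applies;
    // if it was planned, it must also have been finished.
    boolean reconfigurationSettled = !reconfigurationPlanned || reconfigurationDone;
    boolean allTasksStarted = startedTasks.size() == totalTasks;
    return reconfigurationSettled && allTasksStarted;
  }

  /** Completed: continue from recovery events. Otherwise: start over. */
  String recoveryMode() {
    return isCompleted() ? "REPLAY_RECOVERY_EVENTS" : "RESTART_FROM_SCRATCH";
  }
}
```

For example, with 2 tasks and no planned reconfiguration, isCompleted() stays 
false after the first onTaskStarted call and becomes true after the second. 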

Things may change after TEZ-2103, which may kill tasks that are already running. 


  was:
VertexManager is part of Vertex and is a user-facing API. Task recovery 
depends not only on Vertex but also on VertexManager.  Currently 
VertexManager may interact with Vertex throughout the whole lifecycle of 
Vertex, which makes the recovery of Vertex/Task pretty complicated.  Recovering 
VertexManager itself is almost impossible: because it is a user-facing API, we 
have no control over it. 
Defining the completeness of VertexManager could help the recovery of Vertex. 
VertexManager is complete when it has fulfilled its responsibility, so it won't 
interact with Vertex and won't be used by Vertex again. If VertexManager is in 
the completed state, we don't need it in recovery. 

The following are the methods through which VertexManager interacts with 
Vertex via VertexManagerPluginContext:
* setVertexParallelism
* reconfigureVertex
* vertexReconfigurationPlanned
* doneReconfiguringVertex
* scheduleVertexTasks

If VertexManager is in the completed state, these methods won't be called 
again.

Initial idea to represent the completeness of VertexManager: 
* If VertexImpl#vertexReconfigurationPlanned is not invoked, there is 1 
condition for the completeness of VertexManager
** All the tasks are started (all TaskStartedEvents are seen; otherwise we 
can't guarantee that VertexManager will schedule tasks the same way as in the 
last AM attempt). That means VertexManager won't call scheduleTasks again. 
* If VertexImpl#vertexReconfigurationPlanned is invoked, there are 2 conditions 
for the completeness of VertexManager
** VertexImpl#doneReconfiguringVertex is invoked
** All the tasks are started (all TaskStartedEvents are seen; otherwise we 
can't guarantee that VertexManager will schedule tasks the same way as in the 
last AM attempt). That means VertexManager won't call scheduleTasks again. 

If VertexManager is in the completed state, we can continue the recovery of the 
vertex based on the recovery events. Otherwise, recover the vertex from scratch. 

Things may change after TEZ-2103, which may kill tasks that are already running. 



> Define the completeness of VertexManager
> ----------------------------------------
>
>                 Key: TEZ-2589
>                 URL: https://issues.apache.org/jira/browse/TEZ-2589
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
