[
https://issues.apache.org/jira/browse/TEZ-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Zhang updated TEZ-2589:
----------------------------
Description:
VertexManager is part of Vertex and is a user-facing API. Task recovery
depends not only on Vertex but also on VertexManager. Currently, VertexManager
may interact with Vertex throughout the whole lifecycle of the Vertex, which
makes the recovery of Vertex/Task pretty complicated. Recovering VertexManager
itself is almost impossible: because it is a user-facing API, we have no
control over its implementation.
Defining the completeness of VertexManager could help the recovery of Vertex.
VertexManager is complete when it has fulfilled its responsibility and will
neither interact with the Vertex nor be used by the Vertex again. This means
that if VertexManager is completed, we don't need it in recovery.
Initial idea for representing the completeness of VertexManager:
* If VertexImpl#vertexReconfigurationPlanned is not invoked, there is 1
condition for the completeness of VertexManager
** All the tasks are started (all the TaskStartedEvents are seen)
* If VertexImpl#vertexReconfigurationPlanned is invoked, there are 2 conditions
for the completeness of VertexManager
** VertexImpl#doneReconfiguringVertex is invoked
** All the tasks are started (all the TaskStartedEvents are seen)
If VertexManager is in the completed state, we can continue the recovery of the
vertex based on the recovery events. Otherwise, we recover the vertex from
scratch.
Things may change after TEZ-2103, which may kill tasks after they are running.
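The conditions above could be sketched roughly as follows. This is only an
illustration of the proposed rule, not code from Tez: the class
VertexManagerCompleteness, its callback methods, and the RecoveryMode enum are
hypothetical names, and the task count is assumed to be fixed, which ignores
that reconfiguration may change the number of tasks.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical tracker for the proposed completeness rule; not part of Tez.
class VertexManagerCompleteness {

    // The two recovery paths described in the issue.
    enum RecoveryMode { CONTINUE_FROM_EVENTS, FROM_SCRATCH }

    // Assumption: the number of tasks is known and does not change.
    private final int totalTasks;
    private final Set<Integer> startedTasks = new HashSet<>();
    private boolean reconfigurationPlanned;
    private boolean reconfigurationDone;

    VertexManagerCompleteness(int totalTasks) {
        this.totalTasks = totalTasks;
    }

    // Would be called when VertexImpl#vertexReconfigurationPlanned is invoked.
    void onVertexReconfigurationPlanned() {
        reconfigurationPlanned = true;
    }

    // Would be called when VertexImpl#doneReconfiguringVertex is invoked.
    void onDoneReconfiguringVertex() {
        reconfigurationDone = true;
    }

    // Would be called for each TaskStartedEvent; duplicates are ignored.
    void onTaskStarted(int taskIndex) {
        startedTasks.add(taskIndex);
    }

    boolean isComplete() {
        boolean allTasksStarted = startedTasks.size() == totalTasks;
        if (reconfigurationPlanned) {
            // Two conditions: reconfiguration finished and all tasks started.
            return reconfigurationDone && allTasksStarted;
        }
        // One condition: all tasks started.
        return allTasksStarted;
    }

    RecoveryMode recoveryMode() {
        return isComplete()
            ? RecoveryMode.CONTINUE_FROM_EVENTS
            : RecoveryMode.FROM_SCRATCH;
    }
}
```

With such a tracker, recovery would check recoveryMode() once the recovery
events have been replayed, and fall back to recovering the vertex from scratch
whenever the VertexManager was not yet complete.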
was:
The same description, without the final sentence about TEZ-2103.
> Define the completeness of VertexManager
> ----------------------------------------
>
> Key: TEZ-2589
> URL: https://issues.apache.org/jira/browse/TEZ-2589
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang