[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992880#comment-14992880
]
Jeff Zhang commented on TEZ-2581:
---------------------------------
Right, we need a way to differentiate the 2 cases.
* case 1: from -1 to numTasks,
* case 2: from numTasks1 to numTasks2
Currently I have added a new flag in VertexReconfigureDoneEvent to differentiate
these 2 cases (if vertexReconfigurationPlanned is called, it is the second case;
correct me if I am wrong).
In recovery, this flag decides where to restore the vertex status:
* If this flag is false, restore the data in init stage
(Vertex#assignVertexManager)
* If this flag is true, restore the data in running stage
(VertexManagerPlugin#onVertexStarted)
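
A minimal sketch of the flag-based decision described above. All names here (RecoverySketch, ReconfigureDoneRecord, RestoreStage, restoreStage) are hypothetical and only for illustration; this is not the actual Tez code or the content of the WIP patches.

```java
public class RecoverySketch {

    // Hypothetical: where the recovered data would be re-applied.
    enum RestoreStage {
        INIT,    // corresponds to restoring in Vertex#assignVertexManager
        RUNNING  // corresponds to restoring in the running stage of the VertexManagerPlugin
    }

    // Hypothetical stand-in for the recovery event carrying the new flag.
    static final class ReconfigureDoneRecord {
        final boolean reconfigurePlanned; // true means case 2 (numTasks1 to numTasks2)
        final int numTasks;               // parallelism recorded by the last AM attempt

        ReconfigureDoneRecord(boolean reconfigurePlanned, int numTasks) {
            this.reconfigurePlanned = reconfigurePlanned;
            this.numTasks = numTasks;
        }
    }

    // Flag false: restore in the init stage; flag true: restore in the running stage.
    static RestoreStage restoreStage(ReconfigureDoneRecord record) {
        return record.reconfigurePlanned ? RestoreStage.RUNNING : RestoreStage.INIT;
    }

    public static void main(String[] args) {
        // Case 1: parallelism went from -1 to numTasks, no reconfiguration planned.
        System.out.println(restoreStage(new ReconfigureDoneRecord(false, 10)));
        // Case 2: parallelism went from numTasks1 to numTasks2 after a planned reconfiguration.
        System.out.println(restoreStage(new ReconfigureDoneRecord(true, 4)));
    }
}
```

The point of the sketch is that recovery only reads a value persisted by the last AM attempt, so it adds no new behavior (such as timers) to the recovered attempt.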
Regarding your approach, I have 2 concerns:
>>> always call reconfigurationPlanned() in VM.initialize().
This changes the behavior from the last AM attempt. It might introduce risk for
the next recovery (if the AM crashes again).
>>> If numTasks < 0 then it has to fake a trigger by setting up a timer.
Setting up a timer looks a little complicated to me. It brings extra behavior
into recovery.
[~bikassaha] Any concerns about my current approach described above?
> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
> Key: TEZ-2581
> URL: https://issues.apache.org/jira/browse/TEZ-2581
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-2.patch,
> TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch,
> TEZ-2581-WIP-6.patch, TEZ-2581-WIP-7.patch, TEZ-2581-WIP-8.patch,
> TEZ-2581-WIP-9.patch, TezRecoveryRedesignProposal.pdf,
> TezRecoveryRedesignV1.1.pdf
>
>