[ https://issues.apache.org/jira/browse/TEZ-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252652#comment-14252652 ]
Hitesh Shah commented on TEZ-1019:
----------------------------------
bq. In the existing code, we recover tasks when the vertex's recovered state is
INITED. I am not sure why, so I removed it in the new patch. As I understand
it, if the vertex is in INITED, no task should be running, so we don't need to
recover tasks here.
There is no guarantee that the vertex running event was written in time (given
that it is not critical), so the vertex start could have occurred, as well as
tasks starting/finishing.
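A minimal sketch of that point, not the actual Tez code: the class, enum, and
method names below are hypothetical. It only shows INITED being treated like
RUNNING when deciding whether task recovery data should be replayed. How the
other states are handled here is an assumption made for illustration.

```java
import java.util.EnumSet;

// Hypothetical, simplified names; this is not the real vertex implementation.
final class RecoveryDecision {
    enum RecoveredState { NEW, INITED, RUNNING }

    // States for which task-level recovery data may exist.
    // INITED is included because the non-critical vertex running event may
    // not have been written before the failure, so tasks could already have
    // started or finished even though the last recorded state is INITED.
    private static final EnumSet<RecoveredState> MAY_HAVE_TASK_HISTORY =
        EnumSet.of(RecoveredState.INITED, RecoveredState.RUNNING);

    static boolean shouldRecoverTasks(RecoveredState recoveredState) {
        return MAY_HAVE_TASK_HISTORY.contains(recoveredState);
    }
}
```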
bq. When the vertex's recoveredState is RUNNING, we still check numTasks. As I
understand it, numTasks wouldn't be 0 when it is in RUNNING; otherwise that
means init has not completed.
That should be the case in most scenarios. However, now that -1 is allowed on
1:1 edges, where the downstream vertex waits for the upstream parallelism to be
set before its own parallelism is defined, we may need to verify all such
cases. Also, in case of a parallelism update (after running), numTasks need not
be set to 0, but this could simply be a sanity check to verify that the tasks
array matches numTasks.
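One way such a sanity check could look, as a rough sketch only: the class and
method names are hypothetical, and treating -1 as "no tasks should exist yet"
is an assumption for illustration, not confirmed behaviour of the patch.

```java
import java.util.List;

// Hypothetical helper; names are illustrative and not taken from the patch.
final class ParallelismCheck {
    // Assumed marker for parallelism that is not yet determined, e.g. a
    // vertex on a 1:1 edge waiting for its upstream parallelism to be set.
    static final int UNDETERMINED = -1;

    // Returns true when the task list is consistent with numTasks.
    static boolean tasksMatchNumTasks(int numTasks, List<?> tasks) {
        if (numTasks == UNDETERMINED) {
            // Assumption: with unknown parallelism, no tasks have been created.
            return tasks.isEmpty();
        }
        // Any other negative value is rejected; otherwise sizes must agree.
        return numTasks >= 0 && tasks.size() == numTasks;
    }
}
```

A check like this reports a mismatch without changing numTasks, so it stays
valid after a parallelism update.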
> Re-factor routing of events to use common code path for normal and recovery
> flow.
> ---------------------------------------------------------------------------------
>
> Key: TEZ-1019
> URL: https://issues.apache.org/jira/browse/TEZ-1019
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Hitesh Shah
> Assignee: Jeff Zhang
> Attachments: TEZ-1019-2.patch, Tez-1019.patch
>
>