[ 
https://issues.apache.org/jira/browse/TEZ-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252652#comment-14252652
 ] 

Hitesh Shah commented on TEZ-1019:
----------------------------------

bq. In the existing code, we recover tasks when the vertex's recovered state is 
INITED. I am not sure why, so I just removed it in the new patch. As I understand 
it, if the vertex is in INITED, there should be no task running, so we don't need 
to recover tasks here. 
There is no guarantee that the vertex running event was written in time (given 
that it is not critical), so the vertex start could have occurred, as well as 
tasks starting/finishing. 
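
A minimal sketch of the point above, in Java. The class, enum, and method names 
are hypothetical and are not taken from the Tez code base or from the patch; the 
sketch only illustrates that a recovered state of INITED does not rule out tasks 
having run, so task recovery should be attempted for INITED as well as RUNNING.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not Tez's actual classes.
final class VertexRecoverySketch {

    // Simplified stand-in for the vertex state read back from the recovery log.
    enum RecoveredState { NEW, INITED, RUNNING, SUCCEEDED, FAILED }

    // States for which tasks may have started or finished before the restart.
    // INITED is included because the vertex running event is not critical and
    // may not have been written in time, so the last recorded state can lag
    // behind what actually happened.
    private static final Set<RecoveredState> MAY_HAVE_TASKS =
        EnumSet.of(RecoveredState.INITED, RecoveredState.RUNNING);

    // Returns true when task-level recovery should be attempted.
    static boolean shouldRecoverTasks(RecoveredState state) {
        return MAY_HAVE_TASKS.contains(state);
    }
}
```

Under these assumptions, shouldRecoverTasks(RecoveredState.INITED) returns true, 
which is the behaviour the existing code had before the check was removed.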

bq. When the vertex's recoveredState is RUNNING, we still check numTasks. 
As I understand it, numTasks wouldn't be 0 when the vertex is in RUNNING; 
otherwise that means init is not completed.
That should be the case in most scenarios. However, since we allow -1 on 1:1 
edges and wait for the upstream parallelism to be set in order to define the 
downstream vertex parallelism, we may need to verify all such cases. Also, in 
case of a parallelism update (after running), numTasks need not be set to 0, 
but this could just be a sanity check to verify that the tasks array matches 
numTasks.
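
One way such a sanity check could look, as a hedged Java sketch. The class name, 
the UNDETERMINED constant, and the method signature are assumptions made for 
illustration and do not come from Tez. The sketch also assumes that a vertex 
whose parallelism is still -1 has not created any tasks yet.

```java
// Hypothetical sketch: names are illustrative, not Tez's actual classes.
final class NumTasksSanityCheck {

    // Assumed marker for "parallelism not yet decided" (the -1 case on 1:1 edges).
    static final int UNDETERMINED = -1;

    // Returns true when the recorded numTasks is consistent with the number of
    // task objects actually held by the vertex.
    static boolean tasksMatch(int numTasks, int taskCount) {
        if (numTasks == UNDETERMINED) {
            // Assumption: no tasks exist until the upstream parallelism is set.
            return taskCount == 0;
        }
        // Any other negative value is treated as invalid.
        return numTasks >= 0 && numTasks == taskCount;
    }
}
```

A check of this shape only compares the two values; it does not require numTasks 
to be non-zero, so it stays valid after a parallelism update.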



> Re-factor routing of events to use common code path for normal and recovery 
> flow.
> ---------------------------------------------------------------------------------
>
>                 Key: TEZ-1019
>                 URL: https://issues.apache.org/jira/browse/TEZ-1019
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1019-2.patch, Tez-1019.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
