[ 
https://issues.apache.org/jira/browse/TEZ-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200151#comment-14200151
 ] 

Jeff Zhang commented on TEZ-1734:
---------------------------------

[~hitesh]

bq. Could you provide more details on why this is removed?
Because I think it is possible for a vertex to go from NEW to FAILED with a 
non-empty list of recovered events (for example, it gets the RootInputFormation 
from the InputInitializer and then fails before it is inited). If it were kept, 
TestVertexRecovery.testRecovery_RecoveringFromNew2Failed would fail.

bq. Bikas's test case issue.
The reason is that we cannot move a vertex to RUNNING before its parents have 
moved to RUNNING. So in the patch I check whether recoveryStartEventSeen is 
true. If it is, the vertex had already started, so its parents must also have 
started, and in this case we can move the vertex to RUNNING and recover its 
tasks.
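
To make the rule above concrete, here is a rough sketch of the decision, not 
the actual patch. The class VertexRecoverySketch, the Action enum and the 
decide() method are hypothetical names used only for illustration. Only 
recoveryStartEventSeen and the "taskNum may be -1" case come from this issue. 
Waiting for parents when the flag is false, and recovering tasks when the task 
count is known, are my assumptions.

```java
final class VertexRecoverySketch {

    /** Hypothetical outcomes; these names are not taken from the Tez code base. */
    enum Action {
        RESUME_RUNNING_AND_RECOVER_TASKS,
        WAIT_FOR_PARENTS,
        FINISH_AND_RECOVER_TASKS,
        FINISH_WITHOUT_TASK_RECOVERY
    }

    /**
     * Decide what to do with a vertex during recovery.
     *
     * @param recoveredAsFailedOrKilled true if the recovered state is FAILED or KILLED
     * @param recoveryStartEventSeen    true if a start event was seen in the recovery log
     * @param numTasks                  recovered task count, -1 if it was never set
     */
    static Action decide(boolean recoveredAsFailedOrKilled,
                         boolean recoveryStartEventSeen,
                         int numTasks) {
        if (recoveredAsFailedOrKilled) {
            // A vertex that went from NEW straight to FAILED/KILLED may never have
            // had its task count set, so there are no tasks to recover.
            return numTasks == -1
                    ? Action.FINISH_WITHOUT_TASK_RECOVERY
                    : Action.FINISH_AND_RECOVER_TASKS; // assumption, not from the patch
        }
        // A start event implies the parents had started too, so it is safe to
        // move to RUNNING. Otherwise hold back (assumed behaviour).
        return recoveryStartEventSeen
                ? Action.RESUME_RUNNING_AND_RECOVER_TASKS
                : Action.WAIT_FOR_PARENTS;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, true, 4));   // started before the restart
        System.out.println(decide(false, false, 4));  // not started yet
        System.out.println(decide(true, false, -1));  // NEW -> FAILED, no task count
    }
}
```

The point of keying on recoveryStartEventSeen is that the vertex does not need 
to inspect its parents' current state: a recorded start event is taken as proof 
that they had already started.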

BTW, the recovery process still looks complicated to me. I plan to do more 
refactoring to make it cleaner and easier to maintain.


> Vertex's taskNum may be -1 when recovered from NEW to FAILED/KILLED
> -------------------------------------------------------------------
>
>                 Key: TEZ-1734
>                 URL: https://issues.apache.org/jira/browse/TEZ-1734
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.1
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1734-2.patch, TEZ-1734.patch
>
>
> When a vertex is recovered from NEW to FAILED/KILLED, its taskNum may be -1. 
> In this case, we don't need to recover its tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
