[
https://issues.apache.org/jira/browse/TEZ-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200151#comment-14200151
]
Jeff Zhang commented on TEZ-1734:
---------------------------------
[~hitesh]
bq. Could you provide more details on why this is removed?
Because I think a vertex can go from NEW to FAILED with a non-empty list of
recovered events (for example, it gets root input information from the
InputInitializer and then fails before it is inited). If the check is kept,
TestVertexRecovery.testRecovery_RecoveringFromNew2Failed will fail.
bq. Bikas's test case issue.
The reason is that we cannot move a vertex to running before its parents move
to running. So in the patch I check whether recoveryStartEventSeen is true. If
it is true, the vertex was started, so its parents must also have started; in
this case we can move the vertex to running and recover its tasks.
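To make the two conditions concrete, here is a minimal, self-contained sketch
of the decision described above. It is not the code from the patch: the class
VertexRecoverySketch, the RecoveredVertex type, and the recover method are
hypothetical names used only for illustration. Only recoveryStartEventSeen and
the taskNum value of -1 come from this issue.

```java
// Illustrative model only; these types are hypothetical and are not Tez classes.
public class VertexRecoverySketch {
    public enum State { NEW, RUNNING, FAILED, KILLED }

    public static class RecoveredVertex {
        // True if a vertex start event was found in the recovery log.
        public final boolean recoveryStartEventSeen;
        // -1 means the vertex was never inited, so it has no tasks.
        public final int taskNum;
        public State state = State.NEW;
        public boolean tasksRecovered = false;

        public RecoveredVertex(boolean recoveryStartEventSeen, int taskNum) {
            this.recoveryStartEventSeen = recoveryStartEventSeen;
            this.taskNum = taskNum;
        }
    }

    // Applies the recovered target state to a vertex that is currently NEW.
    public static void recover(RecoveredVertex v, State recoveredState) {
        if (recoveredState == State.FAILED || recoveredState == State.KILLED) {
            v.state = recoveredState;
            // Assumption: tasks are recovered only if the vertex was inited.
            v.tasksRecovered = v.taskNum != -1;
            return;
        }
        if (v.recoveryStartEventSeen) {
            // The vertex had started, so its parents must have started too.
            v.state = State.RUNNING;
            v.tasksRecovered = true;
        }
        // Otherwise the vertex stays NEW until its parents are running.
    }

    public static void main(String[] args) {
        RecoveredVertex failedBeforeInit = new RecoveredVertex(false, -1);
        recover(failedBeforeInit, State.FAILED);
        System.out.println(failedBeforeInit.state + " tasksRecovered="
            + failedBeforeInit.tasksRecovered);

        RecoveredVertex started = new RecoveredVertex(true, 4);
        recover(started, State.RUNNING);
        System.out.println(started.state + " tasksRecovered="
            + started.tasksRecovered);
    }
}
```

The sketch keeps the two guards separate on purpose: the taskNum check covers
the NEW to FAILED/KILLED path, and the recoveryStartEventSeen check covers the
move to running.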
BTW, the recovery process is still complicated to me; I plan to do more
refactoring to make it clean and easy to maintain.
> Vertex's taskNum may be -1 when recovered from NEW to FAILED/KILLED
> -------------------------------------------------------------------
>
> Key: TEZ-1734
> URL: https://issues.apache.org/jira/browse/TEZ-1734
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.1
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-1734-2.patch, TEZ-1734.patch
>
>
> When a vertex is recovered from NEW to FAILED/KILLED, the taskNum may be -1;
> in this case, we don't need to recover its tasks.