[ 
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979717#comment-14979717
 ] 

Bikas Saha commented on TEZ-2581:
---------------------------------

bq. should be "vertexData.getVertexFinishedEvent() == null", will fix it.
And it's still working :) That means this code is either not relevant, or there 
is a bug. We have a missing test case that would exercise this code path.

isDAGRecoverable() & isRecoverable() - just changing names to Summary and 
NonSummary should be enough. Also an uber comment explaining the flow like - 1) 
read file 2) check summary recover 3) check non-summary recover - would help in 
understanding the flow.
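
A minimal sketch of what the renamed methods and the flow comment could look like. Everything here is hypothetical (the class names, `ParsedRecoveryLog`, `findRecoveryBlocker`, and the two boolean fields are placeholders for illustration, not the code in the patch); only the three-step order is taken from the suggestion above.

```java
import java.util.Optional;
import java.util.function.Supplier;

public final class RecoveryFlowSketch {

    /** Hypothetical stand-in for the parsed recovery files; not a Tez class. */
    public static final class ParsedRecoveryLog {
        final boolean summaryConsistent;
        final boolean nonSummaryConsistent;

        public ParsedRecoveryLog(boolean summaryConsistent, boolean nonSummaryConsistent) {
            this.summaryConsistent = summaryConsistent;
            this.nonSummaryConsistent = nonSummaryConsistent;
        }
    }

    /*
     * Recovery check flow:
     *   1) read the recovery file
     *   2) check whether the summary events allow recovery
     *   3) check whether the non-summary events allow recovery
     * Returns an empty Optional when recovery can proceed, otherwise the
     * name of the step that blocked it.
     */
    public static Optional<String> findRecoveryBlocker(Supplier<ParsedRecoveryLog> reader) {
        ParsedRecoveryLog log = reader.get();          // 1) read file
        if (!isSummaryRecoverable(log)) {              // 2) summary check
            return Optional.of("summary");
        }
        if (!isNonSummaryRecoverable(log)) {           // 3) non-summary check
            return Optional.of("non-summary");
        }
        return Optional.empty();
    }

    // Placeholder checks; the real conditions live in the patch under review.
    static boolean isSummaryRecoverable(ParsedRecoveryLog log) {
        return log.summaryConsistent;
    }

    static boolean isNonSummaryRecoverable(ParsedRecoveryLog log) {
        return log.nonSummaryConsistent;
    }
}
```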

bq. Because it is not known whether this vertex belongs to any vertex group when 
parsing recovery logs. So here check both vertex level commit and vertex group 
level commit.
Could you please explain a little more? From what I understand, we check if 
there were any in-progress commit operations. They can be 1) vertex commit 
(either after vertex completion or dag completion) 2) group commit (either 
after vertex completion or dag completion). Both of these have recovery logs. 
If these are found but their corresponding finished logs are not found, then we 
can error out, right? Then why do we need to look at individual members of a 
group?
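
To make the question concrete, here is a rough sketch of the check as described above: pair every commit-started log with its finished log and report whatever is left over. The event kinds, the `Event` class, and `findUnfinishedCommits` are hypothetical names for illustration and are not the actual Tez recovery event types; the sketch assumes each started log can be matched to its finished log by a vertex or group id.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public final class InProgressCommitCheck {

    /** Hypothetical event kinds; not the real Tez recovery event types. */
    public enum Kind {
        VERTEX_COMMIT_STARTED, VERTEX_COMMIT_FINISHED,
        GROUP_COMMIT_STARTED, GROUP_COMMIT_FINISHED
    }

    /** Hypothetical recovery log entry: a kind plus the vertex or group id. */
    public static final class Event {
        final Kind kind;
        final String id;

        public Event(Kind kind, String id) {
            this.kind = kind;
            this.id = id;
        }
    }

    /**
     * Returns the commits that have a started log but no finished log.
     * A non-empty result means a commit was in progress, so the caller
     * would error out instead of recovering.
     */
    public static List<String> findUnfinishedCommits(List<Event> events) {
        Set<String> pendingVertexCommits = new LinkedHashSet<>();
        Set<String> pendingGroupCommits = new LinkedHashSet<>();
        for (Event e : events) {
            switch (e.kind) {
                case VERTEX_COMMIT_STARTED:
                    pendingVertexCommits.add(e.id);
                    break;
                case VERTEX_COMMIT_FINISHED:
                    pendingVertexCommits.remove(e.id);
                    break;
                case GROUP_COMMIT_STARTED:
                    pendingGroupCommits.add(e.id);
                    break;
                case GROUP_COMMIT_FINISHED:
                    pendingGroupCommits.remove(e.id);
                    break;
                default:
                    break;
            }
        }
        List<String> unfinished = new ArrayList<>();
        for (String v : pendingVertexCommits) {
            unfinished.add("vertex:" + v);
        }
        for (String g : pendingGroupCommits) {
            unfinished.add("group:" + g);
        }
        return unfinished;
    }
}
```

Note that nothing in this sketch looks at the individual member vertices of a group, which is the part of the patch the question is about.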

> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
>                 Key: TEZ-2581
>                 URL: https://issues.apache.org/jira/browse/TEZ-2581
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-2.patch, 
> TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch, 
> TEZ-2581-WIP-6.patch, TezRecoveryRedesignProposal.pdf, 
> TezRecoveryRedesignV1.1.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
