[ 
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001929#comment-15001929
 ] 

Bikas Saha commented on TEZ-2581:
---------------------------------

The API is called taskRecovery, but essentially it is attemptRecovery, in the 
same sense that taskCommit is essentially done by the attempt. In any case, the 
current approach works too. Looking at that code again, I have a couple of 
comments:
1) IMO, we should not penalize the task for this because it's not the task's 
fault. This removes the need to duplicate the numRetry checking logic.
2) Also, in the current code, the setting of successAttempt etc. is spread 
across the transition body and the recoveryCheck method, which is a little 
confusing. Could we change the recoverCheck method to just return whether the 
task commits got recovered or not, and then take action on that in the main 
transition?
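
To illustrate both points, here is a rough sketch of the shape I have in mind. 
It is not the actual Tez code: every class, field, and method name below 
(Attempt, Task, isCommitRecovered, recover, failedAttempts) is hypothetical and 
only stands in for the real transition and check method. The check reports a 
boolean and mutates nothing, and the transition alone sets the successful 
attempt or reschedules without counting a failure.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; none of these names come from the Tez code base.
class RecoveryTransitionSketch {
    enum AttemptState { SUCCEEDED, FAILED, KILLED }
    enum TaskState { SUCCEEDED, SCHEDULED }

    static final class Attempt {
        final int id;
        final AttemptState state;
        final boolean commitRecovered;

        Attempt(int id, AttemptState state, boolean commitRecovered) {
            this.id = id;
            this.state = state;
            this.commitRecovered = commitRecovered;
        }
    }

    // Pure check: only answers whether this attempt's commit was recovered.
    // It does not touch any task state.
    static boolean isCommitRecovered(Attempt attempt) {
        return attempt.state == AttemptState.SUCCEEDED && attempt.commitRecovered;
    }

    static final class Task {
        Integer successfulAttemptId = null;
        int failedAttempts = 0;

        // Main transition: the only place where task state is updated.
        TaskState recover(List<Attempt> attempts) {
            for (Attempt attempt : attempts) {
                if (isCommitRecovered(attempt)) {
                    successfulAttemptId = attempt.id;
                    return TaskState.SUCCEEDED;
                }
            }
            // Commit was not recovered: reschedule, but do not increment
            // failedAttempts, since this is not the task's fault.
            return TaskState.SCHEDULED;
        }
    }

    public static void main(String[] args) {
        Task task = new Task();
        List<Attempt> attempts = Arrays.asList(
            new Attempt(0, AttemptState.SUCCEEDED, false),
            new Attempt(1, AttemptState.SUCCEEDED, true));
        System.out.println(task.recover(attempts) + " via attempt " + task.successfulAttemptId);
    }
}
```

Because the rescheduling path never touches the failure count, no separate 
numRetry check is needed on the recovery path.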

It would be good to change the TEZ-2939 JIRA title and description to reflect 
the intent.

> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
>                 Key: TEZ-2581
>                 URL: https://issues.apache.org/jira/browse/TEZ-2581
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-10.patch, 
> TEZ-2581-WIP-11.patch, TEZ-2581-WIP-2.patch, TEZ-2581-WIP-3.patch, 
> TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch, TEZ-2581-WIP-6.patch, 
> TEZ-2581-WIP-7.patch, TEZ-2581-WIP-8.patch, TEZ-2581-WIP-9.patch, 
> TezRecoveryRedesignProposal.pdf, TezRecoveryRedesignV1.1.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
