[ 
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976378#comment-14976378
 ] 

Bikas Saha commented on TEZ-2581:
---------------------------------

I haven't looked at the test files yet. It would be good to get an overview of 
the test plan here. I see that there is some randomized testing via the 
ChaosMonkey code. However, it would be useful to also have specific, tightly 
defined cases like those in TestFaultTolerance, and to outline the test matrix 
for them. E.g. use a 3-level DAG with each level either succeeded or running 
when the AM restarts. That gives 2^3 = 8 combinations. In addition, use 
different edge types between the vertices so that we can verify that the 
corresponding vertex managers don't hang on recovery. That adds more 
combinations to the above matrix.
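
To make the size of that matrix concrete, here is a rough sketch that 
enumerates it. It is not taken from the patch or from any Tez test class: 
RecoveryTestMatrix, LevelState, EdgeKind and both methods are hypothetical 
names, the three edge kinds are only placeholders for whichever edge types the 
tests should cover, and it assumes a single edge kind is used for all edges in 
a given case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tez: enumerates recovery test cases.
final class RecoveryTestMatrix {

    // State of one DAG level when the AM restarts (illustrative names).
    enum LevelState { SUCCEEDED, RUNNING }

    // Placeholder edge kinds; substitute the edge types the tests should cover.
    enum EdgeKind { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    // Every assignment of a state to each of `levels` levels: 2^levels entries.
    static List<List<LevelState>> stateCombinations(int levels) {
        List<List<LevelState>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (int i = 0; i < levels; i++) {
            List<List<LevelState>> next = new ArrayList<>();
            for (List<LevelState> prefix : result) {
                for (LevelState state : LevelState.values()) {
                    List<LevelState> extended = new ArrayList<>(prefix);
                    extended.add(state);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    // Crosses each state combination with one edge kind (assumed to apply to
    // every edge in the DAG for that case).
    static List<String> testCases(int levels) {
        List<String> cases = new ArrayList<>();
        for (EdgeKind edge : EdgeKind.values()) {
            for (List<LevelState> states : stateCombinations(levels)) {
                cases.add(edge + " " + states);
            }
        }
        return cases;
    }

    public static void main(String[] args) {
        // Prints one line per case, e.g. an edge kind followed by three states.
        for (String testCase : testCases(3)) {
            System.out.println(testCase);
        }
    }
}
```

With three levels and the three placeholder edge kinds this yields 8 x 3 = 24 
cases. Mixing edge kinds within one DAG would grow the matrix further, so the 
list above is only a starting point for deciding which cases to write out as 
explicit tests.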

> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
>                 Key: TEZ-2581
>                 URL: https://issues.apache.org/jira/browse/TEZ-2581
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-2.patch, 
> TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch, 
> TEZ-2581-WIP-6.patch, TezRecoveryRedesignProposal.pdf, 
> TezRecoveryRedesignV1.1.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
