[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976378#comment-14976378
]
Bikas Saha commented on TEZ-2581:
---------------------------------
I haven't looked at the test files yet. It would be good to get an overview of
the test plan here. I see that there is some randomized testing via the
ChaosMonkey code. However, it would also be useful to have specific, tightly
defined cases like those in TestFaultTolerance, and to outline the test matrix
for them. E.g. use a 3-level DAG with each level either succeeded or running
when the AM restarts. That gives 2^3 = 8 combinations. In addition, use
different edge types between the vertices so that we can verify that their
corresponding vertex managers don't hang on recovery. That adds more
combinations to the above matrix.
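
As a rough illustration of the matrix described above, here is a minimal Java
sketch that enumerates the cases. It does not use any Tez classes: the
RecoveryTestMatrix class, its VertexState and EdgeType enums, and the
buildMatrix helper are all hypothetical names made up for this example. The
choice of three edge types is likewise an assumption, since the comment does
not say which edge types to cover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoveryTestMatrix {
    // Hypothetical stand-ins, not Tez classes: state of a DAG level at AM restart.
    enum VertexState { SUCCEEDED, RUNNING }

    // Hypothetical stand-ins for edge types; which ones to cover is an assumption.
    enum EdgeType { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    static final class Case {
        final VertexState[] levels; // one state per DAG level
        final EdgeType[] edges;     // one edge type per pair of adjacent levels

        Case(VertexState[] levels, EdgeType[] edges) {
            this.levels = levels;
            this.edges = edges;
        }

        @Override
        public String toString() {
            return "levels=" + Arrays.toString(levels) + " edges=" + Arrays.toString(edges);
        }
    }

    // Integer power, to avoid floating-point rounding from Math.pow.
    static int pow(int base, int exp) {
        int result = 1;
        for (int i = 0; i < exp; i++) {
            result *= base;
        }
        return result;
    }

    // Enumerates every combination of level states and edge types for a linear DAG.
    static List<Case> buildMatrix(int levelCount) {
        VertexState[] states = VertexState.values();
        EdgeType[] types = EdgeType.values();
        int edgeCount = levelCount - 1;
        List<Case> cases = new ArrayList<>();
        for (int s = 0; s < pow(states.length, levelCount); s++) {
            for (int e = 0; e < pow(types.length, edgeCount); e++) {
                // Decode s as a base-|states| number, one digit per level.
                VertexState[] levels = new VertexState[levelCount];
                int rem = s;
                for (int i = 0; i < levelCount; i++) {
                    levels[i] = states[rem % states.length];
                    rem /= states.length;
                }
                // Decode e as a base-|types| number, one digit per edge.
                EdgeType[] edges = new EdgeType[edgeCount];
                rem = e;
                for (int i = 0; i < edgeCount; i++) {
                    edges[i] = types[rem % types.length];
                    rem /= types.length;
                }
                cases.add(new Case(levels, edges));
            }
        }
        return cases;
    }

    public static void main(String[] args) {
        List<Case> matrix = buildMatrix(3);
        System.out.println(matrix.size() + " cases");
        for (Case c : matrix) {
            System.out.println(c);
        }
    }
}
```

With these assumptions, a 3-level DAG has 8 state combinations and 2 edges, so
the list has 8 x 3^2 = 72 entries. Each entry could then drive one
parameterized test in the style of TestFaultTolerance.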
> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
> Key: TEZ-2581
> URL: https://issues.apache.org/jira/browse/TEZ-2581
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-2.patch,
> TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch,
> TEZ-2581-WIP-6.patch, TezRecoveryRedesignProposal.pdf,
> TezRecoveryRedesignV1.1.pdf
>
>