[ https://issues.apache.org/jira/browse/TEZ-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109736#comment-15109736 ]
Jason Lowe commented on TEZ-3052:
---------------------------------
The test failure appears to be unrelated: TestFaultTolerance passes for me
locally with the patch applied.
> Task internal error due to Invalid event: T_ATTEMPT_FAILED at FAILED
> --------------------------------------------------------------------
>
> Key: TEZ-3052
> URL: https://issues.apache.org/jira/browse/TEZ-3052
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: TEZ-3052.001.patch
>
>
> A task encountered an internal error due to "Invalid event: T_ATTEMPT_FAILED
> at FAILED". The task had two outstanding attempts because one was speculative.
> The main attempt failed, causing the task to fail, and when the speculative
> attempt subsequently failed, its T_ATTEMPT_FAILED event triggered the invalid
> state transition.
> The TaskImpl state machine appears to need hardening against speculative
> attempt events that arrive late. Besides this scenario, I think there may be
> others; for example, a speculative attempt succeeding just as the overall
> task fails appears to be unhandled.
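
The hardening described in the issue above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the contents of TEZ-3052.001.patch or the real TaskImpl: the class TaskStateSketch, its three states, the T_ATTEMPT_SUCCEEDED event name, and the ignored-event counter are all assumptions made for the example. It also simplifies by letting a single attempt failure fail the task. The idea is only that once the task has reached a terminal state, late events from a speculative attempt are absorbed instead of being treated as invalid transitions.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for a task state machine; NOT Tez's TaskImpl.
class TaskStateSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }

    // T_ATTEMPT_FAILED is the event named in the issue; the other name is assumed.
    enum Event { T_ATTEMPT_SUCCEEDED, T_ATTEMPT_FAILED }

    // States after which attempt events can only be late arrivals.
    private static final Set<State> TERMINAL =
        EnumSet.of(State.SUCCEEDED, State.FAILED);

    private State state = State.RUNNING;
    private int ignoredLateEvents = 0;

    State getState() { return state; }

    int getIgnoredLateEvents() { return ignoredLateEvents; }

    void handle(Event event) {
        if (TERMINAL.contains(state)) {
            // A speculative attempt finished after the task was already done:
            // record it and keep the current state rather than raising an error.
            ignoredLateEvents++;
            return;
        }
        switch (event) {
            case T_ATTEMPT_SUCCEEDED:
                state = State.SUCCEEDED;
                break;
            case T_ATTEMPT_FAILED:
                // Simplification: one failed attempt fails the whole task.
                state = State.FAILED;
                break;
            default:
                throw new IllegalStateException(
                    "Invalid event: " + event + " at " + state);
        }
    }

    public static void main(String[] args) {
        TaskStateSketch task = new TaskStateSketch();
        task.handle(Event.T_ATTEMPT_FAILED); // main attempt fails, task fails
        task.handle(Event.T_ATTEMPT_FAILED); // speculative attempt fails late
        System.out.println(task.getState()
            + ", ignored late events: " + task.getIgnoredLateEvents());
    }
}
```

The same guard covers the second scenario mentioned in the description: a T_ATTEMPT_SUCCEEDED event that arrives after the task has reached FAILED is counted and dropped, so the task stays FAILED.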