[ 
https://issues.apache.org/jira/browse/TEZ-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109736#comment-15109736
 ] 

Jason Lowe commented on TEZ-3052:
---------------------------------

Test failure appears to be unrelated.  TestFaultTolerance passes for me locally 
with the patch applied.

> Task internal error due to Invalid event: T_ATTEMPT_FAILED at FAILED
> --------------------------------------------------------------------
>
>                 Key: TEZ-3052
>                 URL: https://issues.apache.org/jira/browse/TEZ-3052
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: TEZ-3052.001.patch
>
>
> A task encountered an internal error due to "Invalid event: T_ATTEMPT_FAILED 
> at FAILED".  The task had two outstanding attempts, one of which was 
> speculative.  The main attempt failed, causing the task to fail, and when the 
> speculative attempt subsequently failed, its T_ATTEMPT_FAILED event triggered 
> the invalid state transition.
> It appears the TaskImpl state machine needs some hardening against 
> speculative attempt events that arrive late.  Besides this scenario, I think 
> there may be others; e.g., a speculative attempt succeeding just as the 
> overall task fails appears to be unhandled.
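
For illustration, the failure mode described above can be sketched as a small
standalone state machine.  This is not the real TaskImpl and not the attached
patch: the class TaskSketch, its tolerateLateEvents flag, and the
ignoredEvents counter are hypothetical names, and the sketch assumes a single
failed attempt is enough to fail the task.  Whether a late success should be
ignored, as it is here, is a policy assumption rather than something the
issue settles.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical, heavily simplified task state machine (not Tez's TaskImpl). */
final class TaskSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }
    enum Event { T_ATTEMPT_SUCCEEDED, T_ATTEMPT_FAILED }

    // States in which the task's outcome has already been decided.
    private static final Set<State> TERMINAL =
            EnumSet.of(State.SUCCEEDED, State.FAILED);

    private final boolean tolerateLateEvents;
    private State state = State.RUNNING;
    private int ignoredEvents = 0;

    TaskSketch(boolean tolerateLateEvents) {
        this.tolerateLateEvents = tolerateLateEvents;
    }

    State state() { return state; }

    int ignoredEvents() { return ignoredEvents; }

    void handle(Event event) {
        if (TERMINAL.contains(state)) {
            if (!tolerateLateEvents) {
                // Unhardened behavior: a late attempt event in a terminal
                // state is treated as an invalid transition.
                throw new IllegalStateException(
                        "Invalid event: " + event + " at " + state);
            }
            // Hardened behavior (assumed policy): count and drop events
            // from attempts that finish after the task's outcome is final.
            ignoredEvents++;
            return;
        }
        // Simplification: the first attempt result decides the task outcome.
        state = (event == Event.T_ATTEMPT_SUCCEEDED)
                ? State.SUCCEEDED
                : State.FAILED;
    }
}
```

With tolerateLateEvents set to false, delivering T_ATTEMPT_FAILED twice
reproduces the reported error on the second event; with it set to true, the
second event is dropped and the task stays in FAILED.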



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
