[
https://issues.apache.org/jira/browse/TEZ-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259407#comment-15259407
]
Jeff Zhang commented on TEZ-3213:
---------------------------------
Thanks [~ebadger]. The patch lgtm.
DAGImpl/VertexImpl must handle InternalError in every state; otherwise it
causes the infinite loop you found. TaskImpl/TaskAttemptImpl do not need this,
because there is no INTERNAL_ERROR event for them: when an
InvalidEventTransition happens in TaskImpl/TaskAttemptImpl, it sends
DAGEventType.INTERNAL_ERROR, which causes the DAG to move to ERROR.
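
To make the loop concrete, here is a minimal toy model of it. It is only a sketch and does not use the Tez classes: InternalErrorLoopSketch, State, Event, Result and run() are all hypothetical names, and the event cap exists only so the unpatched case terminates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumSet;
import java.util.Set;

// Toy model only: every name here is hypothetical, none of it is Tez code.
class InternalErrorLoopSketch {
    enum State { RECOVERING, RUNNING, ERROR }
    enum Event { V_INTERNAL_ERROR }

    /** Final state plus the number of invalid transitions that were "logged". */
    static final class Result {
        final State finalState;
        final int invalidTransitions;

        Result(State finalState, int invalidTransitions) {
            this.finalState = finalState;
            this.invalidTransitions = invalidTransitions;
        }
    }

    /**
     * Starts in RECOVERING with one V_INTERNAL_ERROR already queued, as if an
     * uncaught exception had just occurred. maxEvents caps the run so that the
     * unpatched case stops instead of looping forever.
     */
    static Result run(boolean handledInAllStates, int maxEvents) {
        // States that have a registered transition for V_INTERNAL_ERROR.
        Set<State> handles = handledInAllStates
                ? EnumSet.allOf(State.class)
                : EnumSet.of(State.RUNNING); // RECOVERING is missing before the patch

        State state = State.RECOVERING;
        Deque<Event> queue = new ArrayDeque<>();
        queue.add(Event.V_INTERNAL_ERROR);

        int invalid = 0;
        int processed = 0;
        while (!queue.isEmpty() && processed < maxEvents) {
            queue.poll();
            processed++;
            if (handles.contains(state)) {
                // Handled: the state machine moves to ERROR and the queue drains.
                state = State.ERROR;
            } else {
                // Invalid transition: count it and report it as another
                // V_INTERNAL_ERROR, which is just as unhandled as the first.
                invalid++;
                queue.add(Event.V_INTERNAL_ERROR);
            }
        }
        return new Result(state, invalid);
    }

    public static void main(String[] args) {
        Result before = run(false, 5);
        Result after = run(true, 5);
        System.out.println("before: " + before.finalState + ", invalid=" + before.invalidTransitions);
        System.out.println("after: " + after.finalState + ", invalid=" + after.invalidTransitions);
    }
}
```

With the cap set to 5, the unpatched run stays in RECOVERING and counts 5 invalid transitions, while the patched run reaches ERROR with none.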
> Uncaught exception during vertex recovery leads to invalid state transition
> loop
> --------------------------------------------------------------------------------
>
> Key: TEZ-3213
> URL: https://issues.apache.org/jira/browse/TEZ-3213
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Jason Lowe
> Assignee: Eric Badger
> Attachments: TEZ-3213-b0.7.001.patch
>
>
> If an uncaught exception occurs during a state transition from the RECOVERING
> vertex state then V_INTERNAL_ERROR will be delivered to the state machine, but
> that event is not handled in the RECOVERING state. That in turn causes another
> V_INTERNAL_ERROR event to be delivered to the state machine, and it loops,
> logging the invalid transitions.