[ https://issues.apache.org/jira/browse/TEZ-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315437#comment-14315437 ]
Bikas Saha commented on TEZ-2074:
---------------------------------
[~zjffdu] What do you think? This looks similar to the shared group commit with
vertex rerunning bug.
> TaskRescheduledAfterVertexSuccessTransition may go back to RUNNING incorrectly
> ------------------------------------------------------------------------------
>
> Key: TEZ-2074
> URL: https://issues.apache.org/jira/browse/TEZ-2074
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Bikas Saha
>
> {code}
> public VertexState transition(VertexImpl vertex, VertexEvent event) {
>   if (vertex.outputCommitters == null // no committer
>       || vertex.outputCommitters.isEmpty() // no committer
>       || !vertex.commitVertexOutputs) { // committer does not commit on vertex success
>     LOG.info(vertex.getLogIdentifier() + " back to running due to rescheduling "
>         + ((VertexEventTaskReschedule)event).getTaskID());
>     (new TaskRescheduledTransition()).transition(vertex, event);
>     // inform the DAG that we are re-running
>     vertex.eventHandler.handle(new DAGEventVertexReRunning(vertex.getVertexId()));
>     return VertexState.RUNNING;
>   }
>   ...
> }
> {code}
> The "// committer does not commit on vertex success" case may be handled
> incorrectly: the DAG might already have completed at this point, with the
> overall (DAG-level) commit in progress. If so, the Vertex and DAG should fail
> instead of the Vertex going back to RUNNING.
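
A minimal, standalone sketch of the decision described above, not the actual Tez code or an agreed fix. The class `RescheduleSketch`, the method `onTaskRescheduled`, and the flag `dagCommitStarted` are hypothetical names introduced for illustration. It also assumes that the elided branch fails the vertex when the committer already committed on vertex success.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, self-contained model of the transition; not Tez's VertexImpl.
final class RescheduleSketch {
  enum VertexState { RUNNING, FAILED }

  // Decides the next vertex state when a task is rescheduled after the
  // vertex has already succeeded. dagCommitStarted is an assumed flag that
  // says whether the DAG-level commit has begun.
  static VertexState onTaskRescheduled(List<String> outputCommitters,
                                       boolean commitVertexOutputs,
                                       boolean dagCommitStarted) {
    boolean hasCommitters = outputCommitters != null && !outputCommitters.isEmpty();
    if (!hasCommitters) {
      // No committer: nothing has been or will be committed, so re-run.
      return VertexState.RUNNING;
    }
    if (commitVertexOutputs) {
      // Assumption: output was committed on vertex success, so fail.
      return VertexState.FAILED;
    }
    // Commit is deferred to the DAG. Re-running is only safe if the
    // DAG-level commit has not started yet.
    return dagCommitStarted ? VertexState.FAILED : VertexState.RUNNING;
  }

  public static void main(String[] args) {
    List<String> committers = Collections.singletonList("output1");
    System.out.println(onTaskRescheduled(committers, false, false));
    System.out.println(onTaskRescheduled(committers, false, true));
  }
}
```

The only difference from the quoted condition is the last branch: a vertex whose committer defers to the DAG goes back to RUNNING only while the DAG commit has not begun.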
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)