[ 
https://issues.apache.org/jira/browse/TEZ-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315437#comment-14315437
 ] 

Bikas Saha commented on TEZ-2074:
---------------------------------

[~zjffdu] What do you think? This looks similar to the bug with shared group 
commit and vertex rerunning.

> TaskRescheduledAfterVertexSuccessTransition may go back to RUNNING incorrectly
> ------------------------------------------------------------------------------
>
>                 Key: TEZ-2074
>                 URL: https://issues.apache.org/jira/browse/TEZ-2074
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>
> {code}
>     public VertexState transition(VertexImpl vertex, VertexEvent event) {
>       if (vertex.outputCommitters == null // no committer
>           || vertex.outputCommitters.isEmpty() // no committer
>           || !vertex.commitVertexOutputs) { // committer does not commit on vertex success
>         LOG.info(vertex.getLogIdentifier() + " back to running due to rescheduling "
>             + ((VertexEventTaskReschedule)event).getTaskID());
>         (new TaskRescheduledTransition()).transition(vertex, event);
>         // inform the DAG that we are re-running
>         vertex.eventHandler.handle(new DAGEventVertexReRunning(vertex.getVertexId()));
>         return VertexState.RUNNING;
>       }
>       ...
>     }
> {code}
> The branch commented "// committer does not commit on vertex success" may be 
> wrong: the DAG might already have completed at this point, with the overall 
> commit in progress. If so, the Vertex and DAG should fail.
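
A minimal sketch of the decision the description argues for, not the actual Tez code. The class, the enum, and the `dagCommitInProgress` flag are hypothetical; the list and boolean parameters are simplified stand-ins for `vertex.outputCommitters` and `vertex.commitVertexOutputs`. It also assumes that the elided branch fails the vertex when outputs are committed on vertex success.

```java
import java.util.List;

// Hypothetical, simplified model of the reschedule-after-success decision.
final class RescheduleSketch {
    enum Outcome { BACK_TO_RUNNING, FAIL_VERTEX_AND_DAG }

    /**
     * @param committers          stand-in for vertex.outputCommitters (may be null)
     * @param commitVertexOutputs stand-in for vertex.commitVertexOutputs
     * @param dagCommitInProgress hypothetical flag: the DAG-level commit has started
     */
    static Outcome onTaskRescheduled(List<String> committers,
                                     boolean commitVertexOutputs,
                                     boolean dagCommitInProgress) {
        // No committer: nothing of this vertex is being committed, so re-run.
        if (committers == null || committers.isEmpty()) {
            return Outcome.BACK_TO_RUNNING;
        }
        // Commit is deferred to DAG success and has not started yet: re-run.
        if (!commitVertexOutputs && !dagCommitInProgress) {
            return Outcome.BACK_TO_RUNNING;
        }
        // Either the vertex already committed on success (assumed behavior of
        // the elided branch) or the DAG commit is in progress: fail.
        return Outcome.FAIL_VERTEX_AND_DAG;
    }
}
```

The only change relative to the quoted condition is the extra check on the third clause: a vertex that defers its commit to DAG success goes back to running only if the DAG commit has not yet begun.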



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
