[
https://issues.apache.org/jira/browse/TEZ-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528986#comment-14528986
]
Bikas Saha commented on TEZ-2412:
---------------------------------
Instead of doing this, should we call checkVerticesForCompletion() in the else
block. This way this transition will not go out of sync with any future checks
added in checkVerticesforCompletion(). E.g. today the check is
completedCount=TotalCount but later it could be different.
{code}
} else {
- if (!dag.commitFutures.isEmpty()) {
- // pending commits are running
+ if (!dag.commitFutures.isEmpty() || dag.numCompletedVertices !=
dag.numVertices) {
+ // pending commits are running or still some vertices are not completed
return DAGState.TERMINATING;
} else {
return finishWithTerminationCause(dag); <<<<<< replace this with
checkVerticesForCompletion. So if there is an error with terminationCause set,
then this transition only changes the commitFutures and uses common logic to
handle the other checks for completion on error.
{code}
Similarly this could be changed
{code}
if (dag.terminationCause == null) {
Preconditions.checkState(dag.getState() == DAGState.COMMITTING,
"DAG should be in COMMITTING state, but in " + dag.getState());
if (!dag.commitFutures.isEmpty()) {
// pending commits are running
return DAGState.COMMITTING;
} else {
return dag.finished(DAGState.SUCCEEDED); <<< return
checkVerticesForCompletion() which does the same thing.
}{code}
Thoughts?
If you dont think this is the correct approach, then we can commit the current
patch with a comment in checkVerticesForCompletion warning that changes in that
transition require an inspection/update of checkCommitsForCompletion for
similar changes.
> Should kill vertex in DAGImpl#VertexRerunWhileCommitting
> --------------------------------------------------------
>
> Key: TEZ-2412
> URL: https://issues.apache.org/jira/browse/TEZ-2412
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Fix For: 0.7.0
>
> Attachments: TEZ-2412-1.patch
>
>
> * When vertex rerun, it move to RUNNING state, so should kill it in
> DAGImpl#VertexRerunWhileCommitting
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)