[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350167#comment-14350167
 ] 

Jeff Zhang commented on TEZ-714:
--------------------------------


* Vertex Level Commit
Add one new state, COMMITTING, to the Vertex state machine, and wrap 
commit and abort in CallableEvent. There is no ABORTING state; the Vertex 
stays in the TERMINATING state while it is aborting. Abort is still invoked 
inline in InternalErrorTransition.
I have uploaded an initial patch and the new state machine diagram. 
[~bikassaha], please help review.

* DAG Commit
Add one new state, COMMITTING, to the DAG state machine, and wrap commit 
and abort in CallableEvent, as for Vertex. Abort is still invoked inline in 
InternalErrorTransition.
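
To illustrate the COMMITTING state described above for Vertex and DAG, here is a minimal sketch, not the patch itself. All names (CommitStateSketch, startCommit, onCommitCompleted, onTerminate) are hypothetical, and the FAILED outcome on a commit error is an assumption. The only point is that commit runs off the dispatcher thread while the state machine waits in COMMITTING.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, simplified model of the idea; none of these names are Tez classes.
class CommitStateSketch {
    enum State { RUNNING, COMMITTING, SUCCEEDED, FAILED, TERMINATING }

    private State state = State.RUNNING;
    // Assumption: a dedicated pool keeps commit work off the dispatcher thread.
    private final ExecutorService commitPool = Executors.newSingleThreadExecutor();

    State getState() { return state; }

    // Dispatcher thread: enter COMMITTING and hand the commit to the pool.
    // The future yields true on success and false if the commit threw.
    CompletableFuture<Boolean> startCommit(Callable<Void> commit) {
        state = State.COMMITTING;
        return CompletableFuture.supplyAsync(() -> {
            try {
                commit.call();
                return true;
            } catch (Exception e) {
                return false;
            }
        }, commitPool);
    }

    // Dispatcher thread: apply the commit result once it is delivered as an event.
    void onCommitCompleted(boolean succeeded) {
        if (state == State.COMMITTING) {
            state = succeeded ? State.SUCCEEDED : State.FAILED;
        }
    }

    // Dispatcher thread: a terminate request goes to TERMINATING; there is no ABORTING state.
    void onTerminate() {
        if (state == State.RUNNING || state == State.COMMITTING) {
            state = State.TERMINATING;
        }
    }

    void shutdown() { commitPool.shutdown(); }
}
```

In this sketch the caller is expected to deliver the future's result back on the dispatcher thread, so the state is only ever mutated from one thread.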

* Vertex Group Commit
Keep track of the vertex group commit count, and move to SUCCEEDED once all 
the vertex group commits are done.
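
The counting described above could look roughly like the following. This is a sketch under assumptions: GroupCommitTracker and its methods are hypothetical names, and it assumes completions are reported only from the dispatcher thread, so no synchronization is shown.

```java
// Hypothetical bookkeeping helper; not a Tez class.
class GroupCommitTracker {
    private final int totalGroupCommits;
    private int completedGroupCommits;

    GroupCommitTracker(int totalGroupCommits) {
        this.totalGroupCommits = totalGroupCommits;
    }

    // Record one finished vertex group commit.
    // Returns true once every group commit has completed, i.e. when it is safe to move to SUCCEEDED.
    boolean onGroupCommitCompleted() {
        completedGroupCommits++;
        return completedGroupCommits == totalGroupCommits;
    }

    int remaining() {
        return totalGroupCommits - completedGroupCommits;
    }
}
```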

* Unit test
Regarding unit tests: currently I leverage the CallableEvent to make the 
committer run in the central dispatcher thread. But this is not consistent 
with the real behavior and may hide potential bugs. I am thinking of ways to 
run the output committer in a separate thread in unit tests too, and will 
add more unit tests in a follow-up patch.
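
One possible way to run a committer in a separate thread in a test while keeping it deterministic is to gate it with latches. This is only a sketch of that idea, not what the patch does; ControlledCommitter and all of its methods are hypothetical, and commit() is a stand-in for a real output committer's commit call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical test helper; not part of Tez.
class ControlledCommitter {
    private final CountDownLatch started = new CountDownLatch(1);
    private final CountDownLatch release = new CountDownLatch(1);
    private volatile Thread commitThread;

    // Stand-in for the real commit: records its thread, then blocks until the test releases it.
    void commit() {
        commitThread = Thread.currentThread();
        started.countDown();
        awaitQuietly(release);
    }

    void awaitStarted() { awaitQuietly(started); }
    void release() { release.countDown(); }
    Thread commitThread() { return commitThread; }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the commit was observed in flight and ran on a thread other than the caller's.
    static boolean commitRunsOffCallerThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            ControlledCommitter committer = new ControlledCommitter();
            CompletableFuture<Void> result = CompletableFuture.runAsync(committer::commit, pool);
            committer.awaitStarted();            // commit has begun on the pool thread
            boolean inFlight = !result.isDone(); // a test could assert the COMMITTING state here
            committer.release();                 // let the commit finish
            result.join();
            return inFlight && committer.commitThread() != Thread.currentThread();
        } finally {
            pool.shutdown();
        }
    }
}
```

The window between awaitStarted() and release() is where a test could send other events (for example a terminate request) and check how the state machine reacts while a commit is in progress.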

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling w.r.t. the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
