[
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350167#comment-14350167
]
Jeff Zhang commented on TEZ-714:
--------------------------------
* Vertex Level Commit
Add one new state, COMMITTING, to the Vertex state machine, and wrap commit
and abort in CallableEvent. There is no ABORTING state; the Vertex stays in
the TERMINATING state while it is aborting. Abort is still invoked inline in
InternalErrorTransition.
I have uploaded an initial patch and the new state machine diagram.
[~bikassaha], please help review.
* DAG Commit
Add one new state, COMMITTING, to the DAG state machine, and wrap commit and
abort in CallableEvent, as for Vertex. Abort is still invoked inline in
InternalErrorTransition.
* Vertex Group Commit
Keep a count of the vertex group commits, and move to SUCCEEDED once all the
vertex group commits are done.
* Unit test
Regarding unit tests: currently I leverage CallableEvent to make the
committer run in the central dispatcher thread. But this is not consistent
with the real behavior and may hide some potential bugs. I am thinking of ways
to run the output committer in a separate thread in unit tests too, and will
add more unit tests in a follow-up patch.
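To illustrate the general idea (not the patch itself), here is a minimal sketch of running commits off the dispatcher thread and counting completions before leaving COMMITTING. Every name in it (GroupCommitSketch, Committer, CommitCompleted, commitAll) is hypothetical and is not a Tez class; the calling thread stands in for the central dispatcher, and abort handling is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: none of these types are Tez classes.
class GroupCommitSketch {
    enum State { COMMITTING, SUCCEEDED, FAILED }

    // Stand-in for an output committer; assumed interface, not the Tez API.
    interface Committer { void commit() throws Exception; }

    // Event posted back to the dispatcher when one commit finishes.
    static final class CommitCompleted {
        final String group;
        final boolean succeeded;
        CommitCompleted(String group, boolean succeeded) {
            this.group = group;
            this.succeeded = succeeded;
        }
    }

    // The calling thread plays the dispatcher: it only handles small
    // completion events, while the commits themselves run in a pool.
    static State commitAll(Map<String, Committer> committers) {
        if (committers.isEmpty()) {
            return State.SUCCEEDED;
        }
        BlockingQueue<CommitCompleted> events = new LinkedBlockingQueue<>();
        ExecutorService commitPool = Executors.newCachedThreadPool();
        State state = State.COMMITTING;
        int pending = committers.size(); // commits still outstanding
        try {
            for (Map.Entry<String, Committer> entry : committers.entrySet()) {
                commitPool.submit(() -> {
                    boolean ok = true;
                    try {
                        entry.getValue().commit();
                    } catch (Exception e) {
                        ok = false;
                    }
                    events.add(new CommitCompleted(entry.getKey(), ok));
                });
            }
            while (state == State.COMMITTING) {
                CommitCompleted event = events.take();
                if (!event.succeeded) {
                    state = State.FAILED; // a real system would abort here
                } else if (--pending == 0) {
                    state = State.SUCCEEDED; // all commits are done
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            state = State.FAILED;
        } finally {
            commitPool.shutdownNow();
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, Committer> committers = new LinkedHashMap<>();
        committers.put("groupA", () -> Thread.sleep(50));
        committers.put("groupB", () -> Thread.sleep(10));
        System.out.println(commitAll(committers));
    }
}
```

In a unit test, a committer that blocks on a latch would let the test observe the COMMITTING state before releasing the commit, which is one possible way to exercise the separate-thread behavior.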
> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
> Key: TEZ-714
> URL: https://issues.apache.org/jira/browse/TEZ-714
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Jeff Zhang
> Priority: Critical
>
> Follow up jira from TEZ-41.
> 1) If there's multiple OutputCommitters on a Vertex, they can be run in
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.