[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366878#comment-14366878
 ] 

Jeff Zhang commented on TEZ-714:
--------------------------------

[~bikassaha] I think the biggest issue in my patch is the granularity of the 
committer thread. Currently a commit thread works at the vertex/DAG level, but 
I think it should be one thread per OutputCommitter. I will update the patch 
later.
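
To illustrate the per-committer granularity, here is a minimal sketch that 
submits each committer as its own task to a shared pool. It is not code from 
the patch: PerCommitterTasks, submitPerCommitter and countSucceeded are 
hypothetical names, and a plain Callable<Void> stands in for an 
OutputCommitter's commit call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the Tez patch.
class PerCommitterTasks {

    // Submits every commit action as a separate task, so that several
    // committers of one vertex can run in parallel. Returns immediately.
    static List<Future<Void>> submitPerCommitter(
            List<Callable<Void>> committers, ExecutorService pool) {
        List<Future<Void>> futures = new ArrayList<>();
        for (Callable<Void> committer : committers) {
            futures.add(pool.submit(committer));
        }
        return futures;
    }

    // Demonstration helper only: blocks until all tasks finish and counts
    // the ones that completed without throwing. A dispatcher thread would
    // not block like this.
    static int countSucceeded(List<Future<Void>> futures) {
        int succeeded = 0;
        for (Future<Void> future : futures) {
            try {
                future.get();
                succeeded++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (ExecutionException e) {
                // This committer failed; leave it uncounted.
            }
        }
        return succeeded;
    }
}
```

With one task per committer, a slow committer delays only its own task 
rather than the other committers of the same vertex.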

As for the other parts, here is a more detailed description of the patch; I 
hope it clarifies the changes.

* VertexImpl.
** The main change is in checkVertexForCompletion, where the commit happens. I 
changed it to an asynchronous commit by wrapping it in a CallableEvent and 
submitting it to the shared thread pool. This introduces a new state, 
COMMITTING, which represents that the vertex is committing.
** The abort operation is also made asynchronous. No new state is introduced 
for it: while a vertex is aborting, it is in the TERMINATING state.

* DAGImpl.
** The main changes are in checkDAGForCompletion(), where the DAG commit 
happens, and in vertexSucceeded(), where the vertex group commit happens. As 
in VertexImpl, I also wrap the DAG commit and the vertex group commit in a 
CallableEvent and submit it to the shared thread pool. This also introduces a 
new state, COMMITTING, which represents that all the vertices are done but 
some commits (DAG commit or vertex group commit) have not yet completed.
** As in VertexImpl, while the DAG is aborting, it is in the TERMINATING 
state.
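
The asynchronous commit flow described above can be sketched roughly as 
follows. This is only an illustration under assumptions, not the patch 
itself: AsyncCommitSketch, Committer, checkForCompletion and handleNextEvent 
are hypothetical names, a BlockingQueue stands in for the AM dispatcher's 
event queue, and a plain Runnable stands in for CallableEvent.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from the Tez patch.
class AsyncCommitSketch {
    enum State { RUNNING, COMMITTING, SUCCEEDED, FAILED }

    // Stand-in for an OutputCommitter's commit call.
    interface Committer { void commit() throws Exception; }

    private final ExecutorService sharedPool;
    // Stand-in for the dispatcher's event queue: commit results arrive here.
    private final BlockingQueue<Boolean> dispatcherQueue = new LinkedBlockingQueue<>();
    // Only the dispatcher thread reads or writes the state.
    private State state = State.RUNNING;

    AsyncCommitSketch(ExecutorService sharedPool) {
        this.sharedPool = sharedPool;
    }

    State getState() { return state; }

    // Runs on the dispatcher thread once all tasks are done. Instead of
    // committing inline, it moves to COMMITTING and hands the commit to the
    // shared pool, so the dispatcher can keep handling other events.
    void checkForCompletion(Committer committer) {
        state = State.COMMITTING;
        sharedPool.submit(() -> {
            boolean ok;
            try {
                committer.commit();
                ok = true;
            } catch (Exception e) {
                ok = false;
            }
            // Report the outcome back to the dispatcher as an event.
            dispatcherQueue.add(ok);
        });
    }

    // Runs on the dispatcher thread: consumes the commit-completed event
    // and leaves the COMMITTING state.
    void handleNextEvent() {
        try {
            boolean ok = dispatcherQueue.take();
            state = ok ? State.SUCCEEDED : State.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            state = State.FAILED;
        }
    }
}
```

The worker thread never touches the state directly; it only posts a result, 
and the state transition out of COMMITTING happens on the dispatcher thread.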


            

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>         Attachments: DAG_2.pdf, TEZ-714-1.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there's multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
