[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315456#comment-14315456
 ] 

Bikas Saha commented on TEZ-714:
--------------------------------

Thanks! I took a quick look at the patch on TEZ-992. The FinishSavingService 
seems to complicate things a bit there, and I think it can be avoided. The 
change can be kept local to DAGImpl and VertexImpl: the callables only need to 
call commit(), without knowing about DAGImpl internals or calling its methods. 
Let me know what you think. 

Yes, recovery logging could be leveraged similarly. In a separate jira, we 
could try to make the first commit callable write the start commit log and the 
last commit callable write the finish commit log. This way the DAG/Vertex state 
machine need not bother about that part.
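
To make the suggestion concrete, here is a minimal sketch of the idea, not the actual Tez code. The names CommitSketch, Committer, runCommits and the log strings COMMIT_STARTED / COMMIT_FINISHED are hypothetical stand-ins for illustration; the real OutputCommitter and recovery APIs differ. Each callable only calls commit(), the first callable to run writes the start record before any commit proceeds, and the last one to finish writes the finish record.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class CommitSketch {

    /** Hypothetical stand-in for an output committer; callables see only commit(). */
    interface Committer {
        void commit() throws Exception;
    }

    /**
     * Runs all committers in parallel on a worker pool and returns the
     * (hypothetical) recovery log records written by the callables.
     */
    static List<String> runCommits(List<Committer> committers) {
        List<String> recoveryLog = Collections.synchronizedList(new ArrayList<>());
        if (committers.isEmpty()) {
            return recoveryLog;
        }
        AtomicInteger remaining = new AtomicInteger(committers.size());
        ExecutorService pool = Executors.newFixedThreadPool(committers.size());
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (Committer c : committers) {
                Callable<Void> task = () -> {
                    // The first callable to get here writes the start record;
                    // the lock keeps other commits from starting before it is written.
                    synchronized (recoveryLog) {
                        if (recoveryLog.isEmpty()) {
                            recoveryLog.add("COMMIT_STARTED");
                        }
                    }
                    c.commit();
                    // Only the last successful callable writes the finish record.
                    if (remaining.decrementAndGet() == 0) {
                        recoveryLog.add("COMMIT_FINISHED");
                    }
                    return null;
                };
                futures.add(pool.submit(task));
            }
            // Waiting here keeps the sketch self-contained; see the note below.
            for (Future<Void> f : futures) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for commits", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a commit failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return recoveryLog;
    }
}
```

In this sketch the caller blocks on the futures only to keep the example short. In the AM, the dispatcher thread would presumably submit the callables and return, and the DAG/Vertex state machine would be notified of completion through an event. If any commit() throws, the finish record is never written, which is what a recovery pass would want to see.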

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling, w.r.t. the DAG, and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
