[ 
https://issues.apache.org/jira/browse/TEZ-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386113#comment-14386113
 ] 

Jeff Zhang edited comment on TEZ-714 at 3/30/15 2:47 AM:
---------------------------------------------------------

Regarding partial output: on second thought, I think we should treat it on a 
per-vertex basis rather than a per-output basis. That is, either all of a 
vertex's outputs commit successfully or all of them are aborted. I think the 
purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is to allow an external 
system to check an intermediate vertex's output at the vertex level. Say a 
vertex has 2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false: if 
the first commit succeeds but the second fails, we should abort both of them 
and move the vertex to the FAILED state. It would be weird for a vertex to go 
to FAILED with one commit aborted while the other is not.
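
To make the proposed semantics concrete, here is a minimal sketch of the 
all-or-nothing behaviour for a single vertex when 
TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false. It is only an illustration: 
the Committer interface, VertexState enum, and commitVertex method below are 
hypothetical stand-ins, not the actual Tez classes or the code in the attached 
patches.

```java
import java.util.ArrayList;
import java.util.List;

final class VertexCommitSketch {

    /** Hypothetical stand-in for an output committer; NOT the Tez API. */
    interface Committer {
        void commit() throws Exception;
        void abort();
    }

    /** Hypothetical, reduced set of vertex end states. */
    enum VertexState { SUCCEEDED, FAILED }

    /**
     * Commits all outputs of one vertex. If any commit fails, every output of
     * the vertex is aborted (including those that already committed) and the
     * vertex is reported as FAILED.
     */
    static VertexState commitVertex(List<Committer> committers) {
        try {
            for (Committer c : committers) {
                c.commit();
            }
            return VertexState.SUCCEEDED;
        } catch (Exception e) {
            for (Committer c : committers) {
                try {
                    c.abort();
                } catch (RuntimeException ignored) {
                    // Keep aborting the remaining outputs.
                }
            }
            return VertexState.FAILED;
        }
    }

    /** Test double that records calls and can be told to fail on commit. */
    static final class RecordingCommitter implements Committer {
        private final String name;
        private final boolean failOnCommit;
        private final List<String> log;

        RecordingCommitter(String name, boolean failOnCommit, List<String> log) {
            this.name = name;
            this.failOnCommit = failOnCommit;
            this.log = log;
        }

        @Override
        public void commit() {
            if (failOnCommit) {
                throw new IllegalStateException("commit failed: " + name);
            }
            log.add("commit " + name);
        }

        @Override
        public void abort() {
            log.add("abort " + name);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Committer> outputs = new ArrayList<>();
        outputs.add(new RecordingCommitter("out1", false, log));
        outputs.add(new RecordingCommitter("out2", true, log));
        System.out.println(commitVertex(outputs) + " " + log);
    }
}
```

In the example in main, the first output commits and the second fails, so both 
outputs are aborted and the vertex ends up FAILED. No output is left committed 
on a failed vertex.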


was (Author: zjffdu):
Regarding partial output: on second thought, I think we should treat it on a 
per-vertex basis rather than a per-output basis. That is, either all of a 
vertex's outputs commit successfully or all of them are aborted. I think the 
purpose of TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is to allow an external 
system to check an intermediate vertex's output at the vertex level. Say a 
vertex has 2 outputs and TEZ_AM_COMMIT_ALL_OUTPUTS_ON_DAG_SUCCESS is false: if 
the first commit succeeds but the second fails, we should abort both of them 
and move the vertex to the FAILED state. 

> OutputCommitters should not run in the main AM dispatcher thread
> ----------------------------------------------------------------
>
>                 Key: TEZ-714
>                 URL: https://issues.apache.org/jira/browse/TEZ-714
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Jeff Zhang
>            Priority: Critical
>         Attachments: DAG_2.pdf, TEZ-714-1.patch, TEZ-714-2.patch, 
> TEZ-714-3.patch, TEZ-714-4.patch, TEZ-714-5.patch, Vertex_2.pdf
>
>
> Follow up jira from TEZ-41.
> 1) If there are multiple OutputCommitters on a Vertex, they can be run in 
> parallel.
> 2) Running an OutputCommitter in the main thread blocks all other event 
> handling for the DAG and causes the event queue to back up.
> 3) This should also cover shared commits that happen in the DAG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
