[ 
https://issues.apache.org/jira/browse/TEZ-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated TEZ-1351:
----------------------------

    Summary: MROutput needs a flush method to ensure data is materialized for 
FileOutputCommitter  (was: Need to invoke isCommitRequired inside 
MROutput.commit())

> MROutput needs a flush method to ensure data is materialized for 
> FileOutputCommitter
> ------------------------------------------------------------------------------------
>
>                 Key: TEZ-1351
>                 URL: https://issues.apache.org/jira/browse/TEZ-1351
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Bikas Saha
>             Fix For: 0.5.0
>
>         Attachments: TEZ-1351.1.patch, TEZ-1351.2.patch
>
>
> In MROutput.commit, we need to check isCommitRequired before invoking 
> commitTask.
> Currently we do this check inside Pig:
> {code}
>                 if (fileOutput.isCommitRequired()) {
>                     fileOutput.commit();
>                 }
> {code}
> However, for some loaders the output file is generated only after 
> fileOutput.close, which is part of fileOutput.commit, so the isCommitRequired 
> check runs too early. A workaround is to invoke fileOutput.close before 
> isCommitRequired:
> {code}
>                 fileOutput.close();
>                 if (fileOutput.isCommitRequired()) {
>                     fileOutput.commit();
>                 }
> {code}
> But we are told there is a plan to make MROutput.close private, which would 
> rule out this workaround.
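
The ordering problem described above can be shown with a small, self-contained sketch. This is not the Tez or Pig API: SketchOutput and its flush method are hypothetical stand-ins, written under the assumption that data only becomes visible to the commit check once the underlying writer has been closed. It compares checking isCommitRequired before and after the data is materialized.

```java
import java.util.ArrayList;
import java.util.List;

final class FlushBeforeCommitSketch {

    /** Hypothetical stand-in for an output whose file appears only when the writer is closed. */
    static final class SketchOutput {
        private final List<String> buffered = new ArrayList<>();
        private final List<String> materialized = new ArrayList<>();
        private boolean flushed = false;
        private boolean committed = false;

        void write(String record) {
            buffered.add(record);
        }

        /** Assumed flush semantics: close the writer so the data exists; safe to call twice. */
        void flush() {
            if (flushed) {
                return;
            }
            materialized.addAll(buffered);
            buffered.clear();
            flushed = true;
        }

        /** True only when materialized data is waiting to be committed. */
        boolean isCommitRequired() {
            return !materialized.isEmpty() && !committed;
        }

        void commit() {
            flush(); // commit also materializes, but by then the check has already run
            committed = true;
        }

        boolean isCommitted() {
            return committed;
        }
    }

    /** The pattern from the first snippet: the check runs before any data exists. */
    static boolean commitWithoutFlush(SketchOutput out) {
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    /** The pattern the new summary asks for: materialize first, then check. */
    static boolean commitWithFlush(SketchOutput out) {
        out.flush();
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    public static void main(String[] args) {
        SketchOutput early = new SketchOutput();
        early.write("row");
        System.out.println("check before flush commits: " + commitWithoutFlush(early));

        SketchOutput late = new SketchOutput();
        late.write("row");
        System.out.println("flush then check commits: " + commitWithFlush(late));
    }
}
```

In this sketch a separate, idempotent flush lets the caller materialize data without depending on close being public, which is the motivation given in the updated summary.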



--
This message was sent by Atlassian JIRA
(v6.2#6252)
