[
https://issues.apache.org/jira/browse/TEZ-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bikas Saha updated TEZ-1351:
----------------------------
Summary: MROutput needs a flush method to ensure data is materialized for
FileOutputCommitter (was: Need to invoke isCommitRequired inside
MROutput.commit())
> MROutput needs a flush method to ensure data is materialized for
> FileOutputCommitter
> ------------------------------------------------------------------------------------
>
> Key: TEZ-1351
> URL: https://issues.apache.org/jira/browse/TEZ-1351
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Daniel Dai
> Assignee: Bikas Saha
> Fix For: 0.5.0
>
> Attachments: TEZ-1351.1.patch, TEZ-1351.2.patch
>
>
> In MROutput.commit, we need to check isCommitRequired before invoking
> commitTask.
> Currently we do this check inside Pig:
> {code}
> if (fileOutput.isCommitRequired()) {
>   fileOutput.commit();
> }
> {code}
> However, with some loaders the output file is generated only after
> fileOutput.close, which is part of fileOutput.commit, so the
> isCommitRequired check runs too early. A workaround is to invoke
> fileOutput.close before isCommitRequired:
> {code}
> fileOutput.close();
> if (fileOutput.isCommitRequired()) {
>   fileOutput.commit();
> }
> {code}
> But we have been told there is a plan to make MROutput.close private.
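
The ordering problem described above can be illustrated with a small self-contained sketch. This is not the Tez or Pig code: `BufferedOutput`, its `flush()` method, and both helper methods are hypothetical stand-ins, assuming an output whose data becomes visible to the commit check only after it has been flushed or closed.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushSketch {

    /** Hypothetical stand-in for an output that materializes data only on flush. */
    static class BufferedOutput {
        private final List<String> buffer = new ArrayList<>();
        private final List<String> materialized = new ArrayList<>();
        private boolean committed = false;

        void write(String record) {
            buffer.add(record);
        }

        // Assumed flush(): makes buffered records visible to the commit check.
        void flush() {
            materialized.addAll(buffer);
            buffer.clear();
        }

        // Mimics the idea of isCommitRequired: true only once data is materialized.
        boolean isCommitRequired() {
            return !materialized.isEmpty();
        }

        // Like the commit described in the report, this also pushes out pending data.
        void commit() {
            flush();
            committed = true;
        }

        boolean isCommitted() {
            return committed;
        }
    }

    // Caller pattern from the report: the check runs before anything is materialized.
    static boolean commitWithoutFlush(BufferedOutput out) {
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    // Caller pattern with an explicit flush before the check.
    static boolean commitWithFlush(BufferedOutput out) {
        out.flush();
        if (out.isCommitRequired()) {
            out.commit();
        }
        return out.isCommitted();
    }

    public static void main(String[] args) {
        BufferedOutput first = new BufferedOutput();
        first.write("row");
        System.out.println(commitWithoutFlush(first)); // prints "false"

        BufferedOutput second = new BufferedOutput();
        second.write("row");
        System.out.println(commitWithFlush(second));   // prints "true"
    }
}
```

In this sketch the first caller silently skips the commit even though a record was written, which is the failure mode the report describes. A public flush step lets the caller materialize data without depending on close being accessible.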
--
This message was sent by Atlassian JIRA
(v6.2#6252)