GitHub user fhueske opened a pull request:

    https://github.com/apache/flink/pull/1056

    [FLINK-2394] [fix] HadoopOutputFormats use correct OutputCommitters.

    Right now, Flink's wrappers for Hadoop OutputFormats always use a
    `FileOutputCommitter`.

    - In the `mapreduce` API, Hadoop OutputFormats have a method
      `getOutputCommitter()` which can be overridden and returns the
      `FileOutputCommitter` by default.
    - In the `mapred` API, the `OutputCommitter` should be obtained from the
      `JobConf`. If nothing custom is set, a `FileOutputCommitter` is returned.

    This PR uses the respective methods to obtain the correct
    `OutputCommitter`. Since `FileOutputCommitter` is the default in both cases,
    the original semantics are preserved if no custom committer is implemented
    or set by the user.
    I also added convenience constructors to the `mapred` wrappers that set
    the `OutputCommitter` in the `JobConf`.
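
The two lookup paths described above can be sketched roughly as follows. This is a minimal, self-contained model and not the code in this PR: every type in it (`OutputCommitter`, `FileOutputCommitter`, `CustomCommitter`, `MapreduceFormat`, `JobConf`) is a hypothetical stand-in defined inside the sketch, not the real `org.apache.hadoop` class of the same name. In particular, the stand-in `setOutputCommitter` takes a factory for brevity, whereas the real `JobConf.setOutputCommitter` takes a committer class.

```java
import java.util.function.Supplier;

// Stand-in types only: they model the shape of the two Hadoop APIs and are
// NOT the real org.apache.hadoop classes.
class CommitterLookupSketch {

    interface OutputCommitter {
        String name();
    }

    // Models the default committer used when nothing custom is provided.
    static class FileOutputCommitter implements OutputCommitter {
        public String name() { return "FileOutputCommitter"; }
    }

    // Hypothetical user-defined committer.
    static class CustomCommitter implements OutputCommitter {
        public String name() { return "CustomCommitter"; }
    }

    // Models a `mapreduce` OutputFormat: the committer comes from a method
    // on the format itself, which subclasses may override.
    static class MapreduceFormat {
        OutputCommitter getOutputCommitter() {
            return new FileOutputCommitter();
        }
    }

    // Models a `mapred` JobConf: the committer is configured on the job and
    // falls back to the default when nothing custom is set.
    static class JobConf {
        private Supplier<OutputCommitter> committerFactory = FileOutputCommitter::new;

        // Simplification: the real method takes a Class, not a Supplier.
        void setOutputCommitter(Supplier<OutputCommitter> factory) {
            this.committerFactory = factory;
        }

        OutputCommitter getOutputCommitter() {
            return committerFactory.get();
        }
    }

    // `mapreduce` path: ask the output format for its committer.
    static OutputCommitter forMapreduce(MapreduceFormat format) {
        return format.getOutputCommitter();
    }

    // `mapred` path: ask the job configuration for its committer.
    static OutputCommitter forMapred(JobConf conf) {
        return conf.getOutputCommitter();
    }

    public static void main(String[] args) {
        // Defaults: both paths yield the file-based committer.
        System.out.println(forMapreduce(new MapreduceFormat()).name());
        System.out.println(forMapred(new JobConf()).name());

        // Custom committer via an overridden method (`mapreduce` style).
        MapreduceFormat customFormat = new MapreduceFormat() {
            @Override
            OutputCommitter getOutputCommitter() {
                return new CustomCommitter();
            }
        };
        System.out.println(forMapreduce(customFormat).name());

        // Custom committer via the job configuration (`mapred` style).
        JobConf conf = new JobConf();
        conf.setOutputCommitter(CustomCommitter::new);
        System.out.println(forMapred(conf).name());
    }
}
```

Because both paths fall back to the file-based committer, a wrapper that always delegates the lookup behaves exactly as before for users who never configure a committer.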

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fhueske/flink hadoopOutCommitter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1056
    
----
commit a632203a948f2e7973339a0eab88750f7ce70cc5
Author: Fabian Hueske <[email protected]>
Date:   2015-07-30T19:47:01Z

    [FLINK-2394] [fix] HadoopOutputFormats use correct OutputCommitters.

----


