[ 
https://issues.apache.org/jira/browse/FLINK-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640999#comment-14640999
 ] 

Fabian Hueske commented on FLINK-2394:
--------------------------------------

Hi [~stefano.bortoli], we already have two HadoopOutputFormatBase classes, 
one for each Hadoop API, so treating the two APIs differently is not a problem. 
The issue is that one API (mapreduce) supports different OutputCommitters 
out of the box, while the other (mapred) requires the OutputCommitter to be 
set explicitly, unless I overlooked something.
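
To make the difference concrete, here is a rough, self-contained sketch. It 
does not use the real Hadoop or Flink classes: every type and method name in 
it (Committer, NewApiOutputFormat, OldApiJobConf, committerForNewApi, 
committerForOldApi) is a hypothetical stand-in. It assumes that in the 
mapreduce API the OutputFormat itself hands out its committer, while in the 
mapred API the committer is looked up in the job configuration and falls back 
to a file-based one when nothing was set.

```java
import java.util.HashMap;
import java.util.Map;

public class CommitterLookupSketch {

    /** Hypothetical stand-in for an output committer; only its name matters here. */
    interface Committer {
        String name();
    }

    static final class FileCommitter implements Committer {
        public String name() { return "FileOutputCommitter"; }
    }

    static final class MongoCommitter implements Committer {
        public String name() { return "MongoOutputCommitter"; }
    }

    /** mapreduce-style (assumed): the output format supplies its own committer. */
    interface NewApiOutputFormat {
        Committer getOutputCommitter();
    }

    /** mapred-style (assumed): the committer lives in the job configuration. */
    static final class OldApiJobConf {
        private final Map<String, Committer> settings = new HashMap<>();

        void setOutputCommitter(Committer committer) {
            settings.put("committer", committer);
        }

        /** Falls back to the file committer when none was set explicitly. */
        Committer getOutputCommitter() {
            return settings.getOrDefault("committer", new FileCommitter());
        }
    }

    /** What a wrapper around a mapreduce-style format would end up using. */
    static String committerForNewApi(NewApiOutputFormat format) {
        return format.getOutputCommitter().name();
    }

    /** What a wrapper around a mapred-style format would end up using. */
    static String committerForOldApi(OldApiJobConf conf) {
        return conf.getOutputCommitter().name();
    }

    public static void main(String[] args) {
        // The format brings its committer along, so nothing has to be configured.
        NewApiOutputFormat mongoFormat = MongoCommitter::new;
        System.out.println(committerForNewApi(mongoFormat));

        // Nothing was set on the configuration, so the file-based default is used.
        OldApiJobConf conf = new OldApiJobConf();
        System.out.println(committerForOldApi(conf));

        // Only an explicit assignment switches to the Mongo committer.
        conf.setOutputCommitter(new MongoCommitter());
        System.out.println(committerForOldApi(conf));
    }
}
```

In the sketch, the second case is the one that matches the reported behaviour: 
the wrapper silently ends up with the file-based committer unless someone sets 
another one.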

> HadoopOutFormat OutputCommitter is default to FileOutputCommiter
> ----------------------------------------------------------------
>
>                 Key: FLINK-2394
>                 URL: https://issues.apache.org/jira/browse/FLINK-2394
>             Project: Flink
>          Issue Type: Bug
>          Components: Hadoop Compatibility
>    Affects Versions: 0.9.0
>            Reporter: Stefano Bortoli
>
> MongoOutputFormat does not write back to the collection because the 
> HadoopOutputFormat wrapper does not allow setting the MongoOutputCommitter 
> and defaults to FileOutputCommitter. Therefore, when close and 
> finalizeGlobal are executed, the commit does not happen and the mongo 
> collection stays untouched. 
> A simple solution would be to:
> 1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat 
> that takes the OutputCommitter as a parameter
> 2 - change the outputCommitter field of HadoopOutputFormatBase to be a 
> generic OutputCommitter
> 3 - remove the default assignment of the outputCommitter to 
> FileOutputCommitter() in open() and finalizeGlobal, or keep it only as a 
> default for when no specific committer is assigned.



