[
https://issues.apache.org/jira/browse/FLINK-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777496#comment-16777496
]
vinoyang commented on FLINK-11737:
----------------------------------
[~StephanEwen] updated. Constructing
{{org.apache.hadoop.mapreduce.lib.output.MultipleOutputs}} in Hadoop requires
an instance of the {{TaskInputOutputContext}} interface, and the most common
implementation of this interface is {{ReduceContextImpl}}. Constructing a
{{ReduceContextImpl}} in turn requires a {{RawKeyValueIterator}}, i.e. an
iterator over the raw key/value input. The lowest-level {{OutputFormat}} in
Flink, however, follows a one-record-at-a-time output model
({{OutputFormat#writeRecord}}) and never exposes an iterator. Currently, the
only way I can use {{MultipleOutputs}} is through a {{MapPartitionFunction}},
which does provide an {{Iterator}}. What do you think of this issue?
cc [~fhueske]
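To make the mismatch concrete, here is a minimal sketch in plain Java with no
Flink or Hadoop dependency. All names in it ({{MultipleOutputsSketch}},
{{NamedOutputs}}, {{writePartition}}) are hypothetical stand-ins, not real
Flink or Hadoop classes, and the "name:value" record layout is only an
assumption for illustration. It shows the partition-level shape: the whole
iterator is handed over, so a multi-output writer can be opened once, fed
every record, and closed once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultipleOutputsSketch {

    // Hypothetical stand-in for a writer that fans records out to named
    // outputs. It only mimics the role of MultipleOutputs; it is NOT the
    // Hadoop class and collects values in memory instead of writing files.
    static final class NamedOutputs implements AutoCloseable {
        private final Map<String, List<String>> outputs = new LinkedHashMap<>();
        private boolean closed = false;

        void write(String namedOutput, String value) {
            if (closed) {
                throw new IllegalStateException("writer already closed");
            }
            outputs.computeIfAbsent(namedOutput, k -> new ArrayList<>()).add(value);
        }

        @Override
        public void close() {
            closed = true;
        }

        Map<String, List<String>> contents() {
            return outputs;
        }
    }

    // Partition-level model: the caller passes the full iterator, which is
    // the shape a per-record writeRecord(record) call cannot offer.
    // Assumption: every record has the form "name:value".
    public static Map<String, List<String>> writePartition(Iterator<String> records) {
        NamedOutputs writer = new NamedOutputs();
        try (NamedOutputs w = writer) {
            while (records.hasNext()) {
                String record = records.next();
                int sep = record.indexOf(':');
                w.write(record.substring(0, sep), record.substring(sep + 1));
            }
        }
        return writer.contents();
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("a:1", "b:2", "a:3");
        // prints {a=[1, 3], b=[2]}
        System.out.println(writePartition(partition.iterator()));
    }
}
```

In a per-record model there is no single call that sees the whole partition,
so the writer's open/close lifecycle and the iterator it depends on would have
to be managed outside the record callback.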
> Support org.apache.hadoop.mapreduce.lib.output.MultipleOutputs output
> ---------------------------------------------------------------------
>
> Key: FLINK-11737
> URL: https://issues.apache.org/jira/browse/FLINK-11737
> Project: Flink
> Issue Type: Improvement
> Components: Batch Connectors and Input/Output Formats
> Reporter: vinoyang
> Assignee: vinoyang
> Priority: Major
>
> This issue is to improve Flink's compatibility with Hadoop. Currently, for
> the old version of the Hadoop API, there is
> {{org.apache.hadoop.mapred.lib.MultipleOutputFormat}}, which can be used
> directly. However, for the new version of the Hadoop API,
> {{org.apache.hadoop.mapreduce.lib.output.MultipleOutputs}} is not yet
> supported by Flink.