[jira] [Commented] (FLINK-1038) Adding a collection output format

Fabian Tschirschnitz (JIRA) Thu, 14 Aug 2014 07:29:18 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097007#comment-14097007
 ]


Fabian Tschirschnitz commented on FLINK-1038:
---------------------------------------------

Hello, 

I created pull request https://github.com/apache/incubator-flink/pull/94 for 
this. A usage example could look like the following: 
http://pastebin.com/PkcWQRkh.
I would be happy to get feedback.

Cheers, Fabian Tschirschnitz (from HPI).



> Adding a collection output format
> ---------------------------------
>
>                 Key: FLINK-1038
>                 URL: https://issues.apache.org/jira/browse/FLINK-1038
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Sebastian Kruse
>            Priority: Minor
>
> Similar to the existing LocalCollectionOutputFormat or Spark's collect() 
> method, it would be nice to have a CollectionOutputFormat that also works 
> when running jobs on a cluster. This output format gathers the results of a 
> sink from all TaskManagers into the JVM that submitted the job plan and 
> provides them as a collection, similar to accumulators. This can help to 
> avoid the tedious task of going to HDFS and reading and parsing the 
> individual result files.
> PS. We have already created such an output format and can contribute it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
