[ https://issues.apache.org/jira/browse/FLINK-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941736#comment-15941736 ]
Greg Hogan commented on FLINK-6185:
-----------------------------------
There is some [support for this
already|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/index.html#read-compressed-files].
I would expect output compression to be similar to reading compressed input
(see {{InflaterInputStreamFactory}}), and parallelism is not an issue. Is this
a feature you would like to work on?
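
To illustrate what "similar to reading compressed input" might look like on the write side, here is a minimal sketch using only java.util.zip. The names {{DeflaterOutputStreamFactory}} and {{GzipOutputStreamFactory}} are hypothetical and do not exist in Flink; they are loosely modelled on the idea behind {{InflaterInputStreamFactory}}, which wraps a raw stream in a decompressing one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class OutputCompressionSketch {

    /** Hypothetical write-side counterpart to InflaterInputStreamFactory (not a Flink API). */
    interface DeflaterOutputStreamFactory<T extends OutputStream> {
        T create(OutputStream out) throws IOException;
    }

    /** Hypothetical gzip factory backed by java.util.zip. */
    static final class GzipOutputStreamFactory
            implements DeflaterOutputStreamFactory<GZIPOutputStream> {
        @Override
        public GZIPOutputStream create(OutputStream out) throws IOException {
            return new GZIPOutputStream(out);
        }
    }

    /** Compresses text through the factory, the way a sink could wrap its file stream. */
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the bytes only afterwards.
        try (OutputStream out = new GzipOutputStreamFactory().create(sink)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return sink.toByteArray();
    }

    /** Reads the bytes back with GZIPInputStream to check the round trip. */
    static String decompress(byte[] gz) throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String csv = "a,b,c\n1,2,3\n";
        byte[] gz = compress(csv);
        System.out.println("round trip ok: " + decompress(gz).equals(csv));
    }
}
```

In this sketch each stream is compressed independently. Assuming every parallel writer produces its own output file, each file would simply get its own gzip stream, which would fit the remark that parallelism is not an issue.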
> Input readers and output writers/formats need to support gzip
> -------------------------------------------------------------
>
> Key: FLINK-6185
> URL: https://issues.apache.org/jira/browse/FLINK-6185
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.0
> Reporter: Luke Hutchison
> Priority: Minor
>
> File sources (such as {{ExecutionEnvironment#readCsvFile()}}) and sinks (such
> as {{FileOutputFormat}} and its subclasses, and methods such as
> {{DataSet#writeAsText()}}) need the ability to transparently decompress and
> compress files. Primarily gzip would be useful, but it would be nice if this
> were pluggable to support bzip2, xz, etc.
> There could be options for autodetect (based on file extension and/or file
> content), which could be the default, as well as no compression or a selected
> compression method.
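
The autodetect option described in the issue (file extension and/or file content) could be sketched roughly as below. This is only an illustration: the {{Codec}} enum and the method names are hypothetical and not part of Flink. The extension list is an assumption, and the content check relies on the standard leading magic bytes of gzip, bzip2 and xz.

```java
import java.util.Locale;

class CompressionDetectSketch {

    /** Hypothetical codec names; not taken from Flink. */
    enum Codec { NONE, GZIP, BZIP2, XZ }

    /** Guesses the codec from the file extension (the extension list is an assumption). */
    static Codec fromExtension(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        if (p.endsWith(".gz") || p.endsWith(".gzip")) return Codec.GZIP;
        if (p.endsWith(".bz2")) return Codec.BZIP2;
        if (p.endsWith(".xz")) return Codec.XZ;
        return Codec.NONE;
    }

    /** Guesses the codec from the first bytes of the file content. */
    static Codec fromMagic(byte[] head) {
        if (startsWith(head, 0x1f, 0x8b)) return Codec.GZIP;            // gzip header
        if (startsWith(head, 'B', 'Z', 'h')) return Codec.BZIP2;        // bzip2 header
        if (startsWith(head, 0xfd, '7', 'z', 'X', 'Z', 0x00)) return Codec.XZ; // xz header
        return Codec.NONE;
    }

    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xff) != magic[i]) return false;
        }
        return true;
    }

    /** Autodetect: trust the extension first, then fall back to the content. */
    static Codec detect(String path, byte[] head) {
        Codec byName = fromExtension(path);
        return byName != Codec.NONE ? byName : fromMagic(head);
    }

    public static void main(String[] args) {
        byte[] gzipHead = {(byte) 0x1f, (byte) 0x8b, 0x08};
        System.out.println(detect("input.csv.gz", new byte[0]));
        System.out.println(detect("input.dat", gzipHead));
    }
}
```

Checking the extension first avoids reading the file just to choose a codec, while the content check covers files whose names carry no hint. An explicit "no compression" or "selected method" setting, as the issue suggests, would bypass {{detect}} entirely.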