[
https://issues.apache.org/jira/browse/FLINK-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050508#comment-16050508
]
ASF GitHub Bot commented on FLINK-6773:
---------------------------------------
GitHub user StefanRRichter opened a pull request:
https://github.com/apache/flink/pull/4130
[FLINK-6773] [checkpoint] Introduce compression (snappy) for keyed st…
This PR introduces optional snappy compression for the keyed state in full
checkpoints and savepoints. The feature can be activated through a flag in
{{ExecutionConfig}}.
In the future, we can also support user-defined compression schemes, which
will also require an upgrade and compatibility feature, as described in
FLINK-6931.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/StefanRRichter/flink compressedKeyGroups
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4130.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4130
----
> Use compression (e.g. snappy) for full check/savepoints
> -------------------------------------------------------
>
> Key: FLINK-6773
> URL: https://issues.apache.org/jira/browse/FLINK-6773
> Project: Flink
> Issue Type: Improvement
> Components: State Backends, Checkpointing
> Reporter: Stefan Richter
> Assignee: Stefan Richter
>
> We could use compression (e.g. snappy stream compression) to decrease the
> size of our full checkpoints and savepoints. From some initial experiments, I
> think there is great potential to achieve compression rates around 30-50%.
> Given those numbers, I think this is very low-hanging fruit to implement.
> One point to consider in the implementation is that compression blocks should
> respect key-groups, i.e. typically it should make sense to compress per
> key-group.
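
The point about compression blocks respecting key-groups can be illustrated with a minimal sketch. This is not the code from the pull request: it assumes java.util.zip (DEFLATE) as a stand-in for snappy so that it runs without extra dependencies, and the class name, method names, and offset table are hypothetical. Each key-group is compressed as an independent block, so a single key-group can be read back without decompressing its neighbours.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: per-key-group compression with an offset table.
// DEFLATE is used here only as a stand-in for snappy.
public class KeyGroupCompressionSketch {

    /** Compresses one key-group's serialized state as an independent block. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses a single block produced by compress(). */
    static byte[] decompress(byte[] block) {
        Inflater inflater = new Inflater();
        inflater.setInput(block);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or invalid block");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /**
     * Writes every key-group as its own compressed block.
     * offsets must have length keyGroups.length + 1; block i occupies
     * the byte range [offsets[i], offsets[i + 1]) of the result.
     */
    static byte[] writeSnapshot(byte[][] keyGroups, int[] offsets) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < keyGroups.length; i++) {
            offsets[i] = out.size();
            byte[] block = compress(keyGroups[i]);
            out.write(block, 0, block.length);
        }
        offsets[keyGroups.length] = out.size();
        return out.toByteArray();
    }

    /** Reads back one key-group without touching the other blocks. */
    static byte[] readKeyGroup(byte[] snapshot, int[] offsets, int keyGroup) {
        byte[] block = Arrays.copyOfRange(snapshot, offsets[keyGroup], offsets[keyGroup + 1]);
        return decompress(block);
    }

    public static void main(String[] args) {
        byte[][] keyGroups = {
            "user-1=10;user-1=11;user-1=12;".getBytes(StandardCharsets.UTF_8),
            "user-2=20;user-2=21;user-2=22;".getBytes(StandardCharsets.UTF_8),
        };
        int[] offsets = new int[keyGroups.length + 1];
        byte[] snapshot = writeSnapshot(keyGroups, offsets);
        byte[] restored = readKeyGroup(snapshot, offsets, 1);
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Because no block spans two key-groups, a restore that only needs some key-groups (for example after rescaling) can seek to the recorded offset and decompress just that range. The trade-off is a somewhat lower compression ratio than compressing the whole stream at once.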
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)