[ https://issues.apache.org/jira/browse/FLINK-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650354#comment-15650354 ]
Stephan Ewen commented on FLINK-5036:
-------------------------------------
Actually, the checkpointing operation is very cheap, as the data is
pre-organized into key groups already.
There was quite a long design process in getting this fleshed out and it seems
to work well. I think we should not change this.
As a general design thought:
Fast recovery is very important. A system that checkpoints slightly faster but
where recovery takes much longer is not desirable. It misses more SLAs than a
system that has slightly higher checkpoint overhead but faster recovery.
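To illustrate why a snapshot can be cheap when the data is already organized by key group, here is a minimal sketch. It is not Flink's code: NUM_KEY_GROUPS, key_group_of (a CRC32 stand-in for the real hash) and GroupedStore are hypothetical names, and a sorted in-memory list stands in for the ordered store. The point is only that if each entry is stored under a (key group, key) composite key, a snapshot is one sequential pass that records where each key group starts.

```python
import bisect
import zlib

NUM_KEY_GROUPS = 4  # hypothetical; plays the role of the maximum parallelism


def key_group_of(key: str) -> int:
    # Assumption: CRC32 is used here only to get a stable hash for the sketch.
    return zlib.crc32(key.encode("utf-8")) % NUM_KEY_GROUPS


class GroupedStore:
    """Toy ordered store whose entries are kept sorted by (key group, key)."""

    def __init__(self):
        self._keys = []    # sorted list of (key_group, key) composites
        self._values = {}  # composite -> value

    def put(self, key: str, value) -> None:
        composite = (key_group_of(key), key)
        if composite not in self._values:
            # The ordering cost is paid on write, not at snapshot time.
            bisect.insort(self._keys, composite)
        self._values[composite] = value

    def snapshot(self):
        """Single sequential pass; no regrouping is needed."""
        entries = []
        offsets = {}  # key group -> index of its first entry in `entries`
        for composite in self._keys:
            group, key = composite
            offsets.setdefault(group, len(entries))
            entries.append((key, self._values[composite]))
        return entries, offsets
```

With the offsets in hand, a restoring task can jump straight to the key groups it owns instead of scanning the whole snapshot.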
> Perform the grouping of keys in restoring instead of checkpointing
> ------------------------------------------------------------------
>
> Key: FLINK-5036
> URL: https://issues.apache.org/jira/browse/FLINK-5036
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Reporter: Xiaogang Shi
>
> Whenever a snapshot of {{RocksDBKeyedStateBackend}} is taken, the values in
> the states are written to different files according to their key groups.
> This procedure is very costly when the states are very big.
> Given that the snapshot operations will be performed much more frequently
> than restoring, we can leave the key groups as they are to improve the
> overall performance. In other words, we can perform the grouping of keys in
> restoring instead of in checkpointing.
> I think the implementation will be very similar to the restoring of
> non-partitioned states. Each task will receive a collection of snapshots, each
> of which contains a set of key groups. Each task will restore its states from
> the given snapshots by picking the values in its assigned key groups.
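The restore-time grouping proposed in the description could look roughly like the following sketch. All names are hypothetical and nothing here is Flink's API: key_group_of uses CRC32 as a stand-in hash, assigned_groups uses a simple contiguous split of key groups across tasks, and a snapshot is modeled as a flat list of (key, value) pairs that were not grouped at checkpoint time.

```python
import zlib

NUM_KEY_GROUPS = 4  # hypothetical; plays the role of the maximum parallelism


def key_group_of(key: str) -> int:
    # Assumption: CRC32 is used here only to get a stable hash for the sketch.
    return zlib.crc32(key.encode("utf-8")) % NUM_KEY_GROUPS


def assigned_groups(task_index: int, parallelism: int) -> set:
    # Assumption: contiguous ranges of key groups per task, for illustration.
    return {
        g for g in range(NUM_KEY_GROUPS)
        if g * parallelism // NUM_KEY_GROUPS == task_index
    }


def restore(snapshots, task_index: int, parallelism: int) -> dict:
    """Scan every entry of every snapshot and keep only the assigned keys."""
    mine = assigned_groups(task_index, parallelism)
    state = {}
    for snapshot in snapshots:        # each snapshot: list of (key, value)
        for key, value in snapshot:   # full scan: the grouping happens here
            if key_group_of(key) in mine:
                state[key] = value
    return state
```

Note that in this sketch every restoring task reads every entry of every snapshot it is given, which is the recovery-time cost that the comment above weighs against the cheaper checkpoint.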