[
https://issues.apache.org/jira/browse/FLINK-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-6484:
----------------------------------
Labels: stale-major (was: )
> Send only the registry keys for already registered files in incremental
> checkpointing
> -------------------------------------------------------------------------------------
>
> Key: FLINK-6484
> URL: https://issues.apache.org/jira/browse/FLINK-6484
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Reporter: Stefan Richter
> Priority: Major
> Labels: stale-major
>
> State handles for files that are (potentially) shared across multiple
> incremental snapshots are registered under an explicit key in the
> {{SharedStateRegistry}}. Furthermore, the backend is aware which of those
> files are new (unregistered) and which are just references to a previous
> snapshot (already) registered.
> The current implementation of incremental checkpoints in RocksDB always
> includes _all_ state handles (unregistered and previously registered) that
> are referenced by the snapshot in the ack message to the checkpoint
> coordinator, assuming that state handles are lightweight pointers.
> While this assumption is true in general, there are notable exceptions such
> as the {{ByteStreamStateHandle}}, which pack the actual payload data into the
> state handle. While the maximum capacity for each {{ByteStreamStateHandle}}
> is limited (around 4MB), multiple handles can be part of a snapshot. This
> makes incremental snapshots over {{ByteStreamStateHandle}} essentially not
> incremental and can lead to huge ack messages.
> To avoid this issue, I propose that we only send the registration key for all
> previously registered state handles and fetch the corresponding state handle
> from the {{SharedStateRegistry}} in the checkpoint coordinator to insert them
> into the checkpoint data.
> While this idea makes the incremental snapshots over
> {{ByteStreamStateHandle}} truly incremental, one problem that remains is on
> restore, when all state handles are send to the operators again. At this
> time, all state handles of the increments will become part of a they
> deployment RPC message. As far as I can see, this will only be solved when
> sending the deployment messages indirectly as pointers to actual deployment
> data in the blob store.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)