[ 
https://issues.apache.org/jira/browse/FLINK-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-6484:
----------------------------------
    Labels: stale-major  (was: )

> Send only the registry keys for already registered files in incremental 
> checkpointing
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-6484
>                 URL: https://issues.apache.org/jira/browse/FLINK-6484
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>            Reporter: Stefan Richter
>            Priority: Major
>              Labels: stale-major
>
> State handles for files that are (potentially) shared across multiple 
> incremental snapshots are registered under an explicit key in the 
> {{SharedStateRegistry}}. Furthermore, the backend is aware which of those 
> files are new (unregistered) and which are just references to a previous 
> snapshot (already registered).
> The current implementation of incremental checkpoints in RocksDB always 
> includes _all_ state handles (unregistered and previously registered) that 
> are referenced by the snapshot in the ack message to the checkpoint 
> coordinator, assuming that state handles are lightweight pointers.
> While this assumption is true in general, there are notable exceptions such 
> as the {{ByteStreamStateHandle}}, which packs the actual payload data into the 
> state handle. While the maximum capacity for each {{ByteStreamStateHandle}} 
> is limited (around 4MB), multiple handles can be part of a snapshot. This 
> makes incremental snapshots over {{ByteStreamStateHandle}} essentially not 
> incremental and can lead to huge ack messages.
> To avoid this issue, I propose that we only send the registration key for all 
> previously registered state handles and fetch the corresponding state handles 
> from the {{SharedStateRegistry}} in the checkpoint coordinator to insert them 
> into the checkpoint data.
> While this idea makes the incremental snapshots over 
> {{ByteStreamStateHandle}} truly incremental, one problem that remains is on 
> restore, when all state handles are sent to the operators again. At this 
> time, all state handles of the increments will become part of the 
> deployment RPC message. As far as I can see, this will only be solved when 
> sending the deployment messages indirectly as pointers to actual deployment 
> data in the blob store.
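
The proposal in the description can be illustrated with a minimal sketch. This is not Flink code: IncrementalAckSketch, Registry, buildAck, completeAck and payloadBytes are hypothetical names, and a plain byte[] stands in for a state handle that carries its payload inline, as {{ByteStreamStateHandle}} does. The sketch assumes the task side knows which keys are already registered and sends a null placeholder (key only) for those.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these types are Flink classes. */
final class IncrementalAckSketch {

    /** Stand-in for the coordinator-side registry: registration key -> state handle. */
    static final class Registry {
        private final Map<String, byte[]> handles = new HashMap<>();

        /** Registers a newly sent handle, or looks up the handle for a key-only entry. */
        byte[] resolve(String key, byte[] handleOrNull) {
            if (handleOrNull != null) {
                handles.putIfAbsent(key, handleOrNull);
            }
            byte[] registered = handles.get(key);
            if (registered == null) {
                throw new IllegalStateException("Unknown shared state key: " + key);
            }
            return registered;
        }
    }

    /** Task side: full handle for new files, key only (null value) for registered ones. */
    static Map<String, byte[]> buildAck(
            Map<String, byte[]> snapshotFiles, Set<String> alreadyRegistered) {
        Map<String, byte[]> ack = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> file : snapshotFiles.entrySet()) {
            boolean known = alreadyRegistered.contains(file.getKey());
            ack.put(file.getKey(), known ? null : file.getValue());
        }
        return ack;
    }

    /** Coordinator side: replace key-only entries with the handles from the registry. */
    static Map<String, byte[]> completeAck(Map<String, byte[]> ack, Registry registry) {
        Map<String, byte[]> complete = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : ack.entrySet()) {
            complete.put(entry.getKey(), registry.resolve(entry.getKey(), entry.getValue()));
        }
        return complete;
    }

    /** Number of payload bytes actually carried by an ack message. */
    static long payloadBytes(Map<String, byte[]> ack) {
        long total = 0;
        for (byte[] handle : ack.values()) {
            if (handle != null) {
                total += handle.length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("sst-1", new byte[4]);
        files.put("sst-2", new byte[3]);

        // First checkpoint: nothing is registered yet, so both handles are sent in full.
        Map<String, byte[]> ack1 = buildAck(files, Collections.<String>emptySet());
        completeAck(ack1, registry);

        // Second checkpoint: only the new file carries payload in the ack.
        files.put("sst-3", new byte[2]);
        Set<String> known = new HashSet<>(Arrays.asList("sst-1", "sst-2"));
        Map<String, byte[]> ack2 = buildAck(files, known);
        System.out.println("ack2 payload bytes: " + payloadBytes(ack2));
        System.out.println("entries after completion: " + completeAck(ack2, registry).size());
    }
}
```

The sketch only covers the ack path. As the description notes, the restore path would still ship every handle in the deployment message, and the sketch does nothing to address that.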



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
