Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

Aljoscha Krettek Wed, 23 Sep 2020 01:39:11 -0700

On 23.09.20 04:40, Yu Li wrote:

To be specific, with the old API users don't need to set checkpoint
storage, instead they only need to pass the checkpoint path w/o caring
about the storage. The new APIs are forcing users to set the storage so
they have to know the difference between different storages. It's not an
implementation change, but an API change that users have to understand and
follow-up.

I think the main point of the FLIP is to make it more obvious to users what is happening.

With current Flink, they would do a `setStateBackend(new FsStateBackend(<path>))`. What the user is actually "saying" with this is: I want to keep state on heap but store checkpoints in DFS. They are not actually changing the "State Backend", the thing that keeps state in operators, but only where state is checkpointed. The thing that is used for local state storage in operators is still the "Heap Backend".

With the proposed FLIP, a user would do a `setCheckpointStorage(new FsStorage(<path>))`, which makes it obvious that they're changing where checkpoints are stored but not the actual "State Backend", which is still "Heap Backend" (the default).
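To make the difference concrete, here is a small self-contained toy model in plain Java. It does not use Flink at all: `BackendSplitSketch`, `Config`, `Env`, `LocalState` and `oldStyleFsStateBackend` are hypothetical stand-ins invented for illustration, the `ROCKSDB` value is only there to show that the two settings vary independently, and both the `hdfs:///checkpoints` path and the `<default location>` placeholder are assumptions. It is a sketch of the idea, not of the actual API.

```java
public class BackendSplitSketch {

    // Where operators keep their working state while the job runs.
    enum LocalState { HEAP, ROCKSDB }

    // The two decisions that together describe a job's state setup.
    static final class Config {
        final LocalState localState;
        final String checkpointLocation;

        Config(LocalState localState, String checkpointLocation) {
            this.localState = localState;
            this.checkpointLocation = checkpointLocation;
        }

        String describe() {
            return "local state: " + localState + ", checkpoints: " + checkpointLocation;
        }
    }

    // Old model: a single "state backend" choice fixes both decisions at once.
    // Stand-in for what the FsStateBackend call means: heap state, checkpoints at the path.
    static Config oldStyleFsStateBackend(String path) {
        return new Config(LocalState.HEAP, path);
    }

    // New model: two independent settings, each with its own default.
    static final class Env {
        private LocalState localState = LocalState.HEAP;          // heap is the default backend
        private String checkpointLocation = "<default location>"; // placeholder, an assumption

        Env setStateBackend(LocalState localState) {
            this.localState = localState;
            return this;
        }

        Env setCheckpointStorage(String path) {
            this.checkpointLocation = path;
            return this;
        }

        Config resolve() {
            return new Config(localState, checkpointLocation);
        }
    }

    public static void main(String[] args) {
        String path = "hdfs:///checkpoints"; // hypothetical path

        // Old call: the name says "state backend", but only the checkpoint location changes.
        Config before = oldStyleFsStateBackend(path);

        // New call: the name says exactly what changes; local state stays on the default.
        Config after = new Env().setCheckpointStorage(path).resolve();

        System.out.println(before.describe());
        System.out.println(after.describe());
    }
}
```

In this model both calls resolve to the same configuration; the only thing that changes is that the new call names the setting it actually touches.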

I do understand Yu's point, though, that this will be confusing for current Flink users. They are used to setting a "State Backend" if/when they want to change the storage location. To fit the new model they would have to change the call from `setStateBackend()` to `setCheckpointStorage()`.

I think we need to live with this short-term confusion because in the long run the proposed split between checkpoint location and state backend makes sense and will be more straightforward for users to understand.


Best,
Aljoscha
