Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

Yu Li Thu, 24 Sep 2020 08:24:41 -0700

And to make it clear, I'm +1 on the idea of decoupling state backends with
checkpointing. I don't have any question about making it clear that
heap/RocksDB is where we serve the routine state read/write and where to
put the checkpoint data is another story. My only concern lies in the newly
introduced setCheckpointStorage API and how we define its semantics, and
not sure whether it's due to my ignorance.


Best Regards,
Yu


On Thu, 24 Sep 2020 at 23:11, Yu Li <car...@gmail.com> wrote:

> *bq. What new confusion would be introduced here?*
> No *new* confusion introduced, but as mentioned at the very beginning of
> the motivation ("Apache Flink's durability story is a mystery to many
> users"), I thought the FLIP aims at resolving some *existing*
> confusions, i.e. the durability mystery to users.
>
> For me, I'm not 100% clear about how to write the javadoc of the
> setCheckpointStorage API. Would it be like "specify where the checkpoint
> data is stored"? If so, do we need to explain the fact that when a
> checkpoint path is given, JM will also persist the checkpoint data to DFS?
> It's true that such confusion also exists today, but would the introduction
> of the new API expose it further?
>
> IMHO we need to document the newly introduced API / classes and their
> semantics clearly in the FLIP to make sure everyone is on the same page,
> but if we feel such work / discussions are all details and only need to
> happen at the documenting and release note phase, it's also fine to me.
>
> And if I'm the only one who has such questions / concerns on the new
> `setCheckpointStorage` API and most of others feel its semantic is sound
> and clear, then please just ignore me and move on.
>
> Thanks.
>
> Best Regards,
> Yu
>
>
> On Wed, 23 Sep 2020 at 17:08, Stephan Ewen <se...@apache.org> wrote:
>
>> I am confused now with the concerns here. This is very much from the user
>> perspective (which is partially also the developer perspective which is
>> the
>> sign of an intuitive abstraction).
>>
>> Of course, there will be docs describing what JMCheckpointStorage and
>> FsCheckpointStorage are.
>> And having release notes that describe that
>> RocksDBStateBackend("s3://...")
>> now corresponds to a combination of "RocksDBBackend" and
>> "FsCheckpointStorage" is also straightforward.
>>
>> We said to keep the old RocksDBStateBackend class and let it implement
>> both
>> interfaces such that the old code still works exactly as before.
>>
>> What new confusion would be introduced here?
>> Understanding the difference between JMCheckpointStorage and
>> FsCheckpointStorage was always necessary when one needed to understand the
>> difference between MemoryStateBackend and FsStateBackend. It should be
>> easier to define this after this change, because it is the only thing that
>> we describe when explaining what checkpoint storage to use (rather than
>> also having the choice of index structure coupled to that).
>>
>>
>> On Wed, Sep 23, 2020 at 10:39 AM Aljoscha Krettek <aljos...@apache.org>
>> wrote:
>>
>> > On 23.09.20 04:40, Yu Li wrote:
>> > > To be specific, with the old API users don't need to set checkpoint
>> > > storage, instead they only need to pass the checkpoint path w/o caring
>> > > about the storage. The new APIs are forcing users to set the storage
>> so
>> > > they have to know the difference between different storages. It's not
>> an
>> > > implementation change, but an API change that users have to understand
>> > and
>> > > follow-up.
>> >
>> > I think the main point of the FLIP is to make it more obvious to users
>> > what is happening.
>> >
>> > With current Flink, they would do a `setStateBackend(new
>> > FsStateBackend(<path>))`. What the user is actually "saying" with this
>> > is: I want to keep state on heap but store checkpoints in DFS. They are
>> > not actually changing the "State Backend", the thing that keeps state in
>> > operators, but only where state is checkpointed. The thing that is used
>> > for local state storage in operators is still the "Heap Backend".
>> >
>> > With the proposed FLIP, a user would do a `setCheckpointStorage(new
>> > FsStorage(<path>))`. Which makes it obvious that they're changing where
>> > checkpoints are stored but not the actual "State Backend", which is
>> > still "Heap Backend" (the default).
>> >
>> > I do understand Yu's point, though, that this will be confusing for
>> > current Flink users. They are used to setting a "State Backend" if/when
>> > they want to change the storage location. To fit the new model they
>> > would have to change the call from `setStateBackend()` to
>> > `setCheckpointStorage()`.
>> >
>> > I think we need to life with this short-term confusion because in the
>> > long run the proposed split between checkpoint location and state
>> > backend makes sense and will be more straightforward for users to
>> > understand.
>> >
>> > Best,
>> > Aljoscha
>> >
>> >
>>
>

Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

Reply via email to