Thanks for the update, Piotr!

> Is `state.backend.incremental` the only configuration parameter that can be
> used in this context?
According to FLIP-193 [1], all the existing checkpoint configurations
actually apply to *snapshots* in general; ownership (lifecycle) is the only
difference between checkpoints and savepoints. I suggest we keep the
description aligned with FLIP-193.
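
For illustration only, here is a minimal `flink-conf.yaml` sketch of the
snapshot-level settings mentioned in this thread. It assumes a RocksDB state
backend, and the values are examples rather than recommendations. Whether
each option takes effect for native savepoints is what the FLIP text would
need to spell out.

```yaml
# Assumed setup for this sketch: RocksDB state backend.
state.backend: rocksdb

# Under the wording proposed in this thread, this would denote the type of
# native-format snapshot, covering both checkpoints and native savepoints.
state.backend.incremental: true

# Another snapshot-related option raised in this thread (example value).
state.storage.fs.memory-threshold: 20kb
```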

> a) What about RocksDB upgrades? If we bump RocksDB version between Flink
> versions, do we support recovering from a native format snapshot
> (incremental checkpoint)?
Below are my two cents:
* The purpose of an incremental native-format savepoint (like a *snapshot*
in a traditional database [2]) is to quickly produce a persisted,
self-contained version of the current state of the job for point-in-time
recovery. It cannot replace a canonical savepoint (like a *backup* in a
traditional database) for upgrades, state backend switching, etc.
* We prefer such functionality to be supplied by a *savepoint* instead of a
(retained) *checkpoint* because the life-cycle of the data should be
user-controlled rather than system-controlled [1].
* If we'd like to cover all the functionality the canonical savepoint has
now, a design for an incremental *canonical-format* savepoint would be
required, which is more complicated and could be considered future work.

Best Regards,
Yu

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
[2] https://www.hitechnectar.com/blogs/snapshot-vs-backup/


On Thu, 13 Jan 2022 at 19:40, Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi,
>
> Thanks for the comments and questions. Starting from the top:
>
> Seth: good point about schema evolution. Actually, I have a very similar
> question about the State Processor API. Is it the same scenario in this
> case? Should it also work with checkpoints, but might just be untested?
>
> And next question, should we commit to supporting those two things (State
> Processor API and schema evolution) for native savepoints? What about
> aligned checkpoints? (please check [1] for that).
>
> Yu Li: 1, 2 and 4 done.
>
> > 3. How about changing the description of "the default configuration of
> > the checkpoints will be used to determine whether the savepoint should be
> > incremental or not" to something like "the `state.backend.incremental`
> > setting now denotes the type of native format snapshot and will take
> > effect for both checkpoint and savepoint (with native type)", to prevent
> > concept confusion between checkpoint and savepoint?
>
> Is `state.backend.incremental` the only configuration parameter that can be
> used in this context? I would guess not? What about for example
> "state.storage.fs.memory-threshold" or all of the Advanced RocksDB State
> Backends Options [2]?
>
> David:
>
> > does this mean that we need to keep the checkpoints compatible across
> > minor versions? Or can we say that the minor version upgrades are only
> > guaranteed with canonical savepoints?
>
> Good question. Frankly I was always assuming that this is implicitly given.
> Otherwise users would not be able to recover jobs that are failing because
> of bugs in Flink. But I'm pretty sure that was never explicitly stated.
>
> As Konstantin suggested, I've written down the pre-existing guarantees of
> checkpoints and savepoints followed by two proposals on how they should be
> changed [1]. Could you take a look?
>
> I'm especially unsure about the following things:
> a) What about RocksDB upgrades? If we bump RocksDB version between Flink
> versions, do we support recovering from a native format snapshot
> (incremental checkpoint)?
> b) State Processor API - both pre-existing and what do we want to provide
> in the future
> c) Schema Evolution - both pre-existing and what do we want to provide in
> the future
>
> Best,
> Piotrek
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Checkpointvssavepointguarantees
> [2]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#advanced-rocksdb-state-backends-options
>
> On Tue, 11 Jan 2022 at 09:45, Konstantin Knauf <kna...@apache.org> wrote:
>
> > Hi Piotr,
> >
> > would it be possible to provide a table that shows the
> > compatibility guarantees provided by the different snapshots going
> > forward? Like type of change (Topology, State Schema, Parallelism, ...) in
> > one dimension, and type of snapshot as the other dimension. Based on that,
> > it would be easier to discuss those guarantees, I believe.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Mon, Jan 3, 2022 at 9:11 AM David Morávek <d...@apache.org> wrote:
> >
> > > Hi Piotr,
> > >
> > > does this mean that we need to keep the checkpoints compatible across
> > > minor versions? Or can we say that the minor version upgrades are only
> > > guaranteed with canonical savepoints?
> > >
> > > My concern is especially if we'd want to change the layout of the
> > > checkpoint.
> > >
> > > D.
> > >
> > >
> > >
> > > On Wed, Dec 29, 2021 at 5:19 AM Yu Li <car...@gmail.com> wrote:
> > >
> > > > Thanks for the proposal Piotr! Overall I'm +1 for the idea, and below
> > > > are my two cents:
> > > >
> > > > 1. How about adding a "Term Definition" section clarifying what
> > > > "native format" (the "native" data persistence format of the current
> > > > state backend) and "canonical format" (the "uniform" format that
> > > > supports switching state backends) mean?
> > > >
> > > > 2. IIUC, currently the FLIP proposes to only support incremental
> > > > savepoint with native format, and there's no plan to add such support
> > > > for canonical format, right? If so, how about writing this down
> > > > explicitly in the FLIP doc, maybe in a "Limitations" section, plus the
> > > > fact that `HashMapStateBackend` cannot support incremental savepoint
> > > > before FLIP-151 is done? (side note: @Roman just a kind reminder to
> > > > please take FLIP-203 into account when implementing FLIP-151)
> > > >
> > > > 3. How about changing the description of "the default configuration of
> > > > the checkpoints will be used to determine whether the savepoint should
> > > > be incremental or not" to something like "the
> > > > `state.backend.incremental` setting now denotes the type of native
> > > > format snapshot and will take effect for both checkpoint and savepoint
> > > > (with native type)", to prevent concept confusion between checkpoint
> > > > and savepoint?
> > > >
> > > > 4. How about moving the notes on the behavior change (the default type
> > > > of savepoint will be changed to `native` in the future, and by then
> > > > the taken savepoint cannot be used to switch state backends by
> > > > default) to a more obvious place, for example from the "CLI" section
> > > > to the "Compatibility" section? (although it will only happen in the
> > > > 1.16 release based on the proposed plan)
> > > >
> > > > All of the above suggestions, if taken, also apply to our user-facing
> > > > documentation after the FLIP is (partially or completely, accordingly)
> > > > done (smile).
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > > On Tue, 21 Dec 2021 at 22:23, Seth Wiesman <sjwies...@gmail.com>
> > > > > wrote:
> > > >
> > > > > > >> AFAIK state schema evolution should work both for native and
> > > > > > >> canonical savepoints.
> > > > >
> > > > > > Schema evolution does technically work for both formats, since it
> > > > > > happens after the code paths have been unified, but the community
> > > > > > has up until this point considered it an unsupported feature. From
> > > > > > my perspective, making this supported could be as simple as adding
> > > > > > test coverage, but that's an active decision we'd need to make.
> > > > >
> > > > > > On Tue, Dec 21, 2021 at 7:43 AM Piotr Nowojski <pnowoj...@apache.org>
> > > > > > wrote:
> > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > > > In this context: will the native format support state schema
> > > > > > > > evolution? If not, I am not sure we can let the format default
> > > > > > > > to native.
> > > > > >
> > > > > > > AFAIK state schema evolution should work both for native and
> > > > > > > canonical savepoints.
> > > > > >
> > > > > > > We will document what is/will be supported as part of this
> > > > > > > FLIP-203. But it's not as simple as just the difference between
> > > > > > > native and canonical formats.
> > > > > >
> > > > > > Best, Piotrek
> > > > > >
> > > > > > > On Mon, 20 Dec 2021 at 14:28, Konstantin Knauf <kna...@apache.org>
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Piotr,
> > > > > > >
> > > > > > > Thanks a lot for starting the discussion. Big +1.
> > > > > > >
> > > > > > > > In my understanding, this FLIP introduces the snapshot format as
> > > > > > > > a *really* user facing concept. IMO it is important that we
> > > > > > > > document
> > > > > > > >
> > > > > > > > a) that it is no longer the checkpoint/savepoint characteristics
> > > > > > > > that determine the kind of changes that a snapshot allows (user
> > > > > > > > code, state schema evolution, topology changes), but now this
> > > > > > > > becomes a property of the format regardless of whether this is a
> > > > > > > > savepoint or a checkpoint
> > > > > > > > b) the exact changes that each format allows (code, state
> > > > > > > > schema, topology, state backend, max parallelism)
> > > > > > >
> > > > > > > > In this context: will the native format support state schema
> > > > > > > > evolution? If not, I am not sure we can let the format default
> > > > > > > > to native.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > > >
> > > > > > > > On Mon, Dec 20, 2021 at 2:09 PM Piotr Nowojski <pnowoj...@apache.org>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi devs,
> > > > > > > >
> > > > > > > > > I would like to start a discussion about a previously announced
> > > > > > > > > follow up of the FLIP-193 [1], namely allowing savepoints to be
> > > > > > > > > in native format and incremental. The changes do not seem
> > > > > > > > > invasive. The full proposal is written down as FLIP-203:
> > > > > > > > > Incremental savepoints [2]. Please take a look, and let me know
> > > > > > > > > what you think.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Piotrek
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
> > > > > > > > [2]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Konstantin Knauf
> > > > > > >
> > > > > > > https://twitter.com/snntrable
> > > > > > >
> > > > > > > https://github.com/knaufk
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable
> >
> > https://github.com/knaufk
> >
>
