Hi, Timo, +1 for the improvement too. Thanks for the great work.

Looking forward to further progress on the FLIP; I could help with the
development of some of the specific improvements.

Best,
Wenlong

On Mon, 13 Dec 2021 at 14:43, godfrey he <godfre...@gmail.com> wrote:

> Hi Timo,
>
> +1 for the improvement.
>
> Best,
> Godfrey
>
> Timo Walther <twal...@apache.org> wrote on Fri, 10 Dec 2021 at 20:37:
> >
> > Hi Wenlong,
> >
> > yes it will. Sorry for the confusion. This is a logical consequence of
> > the assumption:
> >
> > The JSON plan contains no implementation details (i.e. no classes) and
> > is fully declarative.
> >
> > I will add a remark.
> >
> > Thanks,
> > Timo
> >
> >
> > On 10.12.21 11:43, wenlong.lwl wrote:
> > > Hi, Timo, thanks for the explanation. I totally agree with what you said.
> > > My actual question is: Will the version of an exec node be serialised in
> > > the JSON plan? In my understanding, it is not in the former design. If it
> > > is, my question is already answered.
> > >
> > >
> > > Best,
> > > Wenlong
> > >
> > >
> > > On Fri, 10 Dec 2021 at 18:15, Timo Walther <twal...@apache.org> wrote:
> > >
> > >> Hi Wenlong,
> > >>
> > >> I also thought about adding a `flinkVersion` field per ExecNode. But this
> > >> is not necessary, because the `version` of the ExecNode has the same
> > >> purpose.
> > >>
> > >> The plan version just encodes that:
> > >> "plan has been updated in Flink 1.17" / "plan is entirely valid for
> > >> Flink 1.17"
> > >>
> > >> The ExecNode version maps to `minStateVersion` to verify state
> > >> compatibility.
> > >>
> > >> So even if the plan version is 1.17, some ExecNodes use state layout
> of
> > >> 1.15.
> > >>
> > >> It is totally fine to only update the ExecNode to version 2 and not 3
> in
> > >> your example.
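> > >>
> > >> As a rough sketch of how I think about the check (everything below is
> > >> just for illustration, not the actual implementation):
> > >>
> > >> // Illustrative only: "1.15"-style versions compared by (major, minor).
> > >> final class StateCompatibilityCheck {
> > >>
> > >>     // State can be restored if the savepoint is not older than the
> > >>     // minStateVersion of the ExecNode version referenced in the plan.
> > >>     static boolean isRestorable(String savepointVersion, String minStateVersion) {
> > >>         return compare(savepointVersion, minStateVersion) >= 0;
> > >>     }
> > >>
> > >>     private static int compare(String a, String b) {
> > >>         String[] x = a.split("\\.");
> > >>         String[] y = b.split("\\.");
> > >>         int major = Integer.compare(Integer.parseInt(x[0]), Integer.parseInt(y[0]));
> > >>         return major != 0
> > >>             ? major
> > >>             : Integer.compare(Integer.parseInt(x[1]), Integer.parseInt(y[1]));
> > >>     }
> > >> }
> > >>
> > >> So a plan whose overall version is 1.17 can still reference an ExecNode
> > >> version with minStateVersion 1.15, and a 1.15 savepoint restores fine.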
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >>
> > >>
> > >> On 10.12.21 06:02, wenlong.lwl wrote:
> > >>> Hi, Timo, thanks for updating the doc.
> > >>>
> > >>> I have a comment on plan migration:
> > >>> I think we may need to add a version field for every exec node when
> > >>> serialising. In earlier discussions, I think we concluded that the version
> > >>> of the plan would be treated as the version of the node, but in this case
> > >>> that would be broken.
> > >>> Taking the following example from the FLIP into consideration, there is a
> > >>> bad case: when in 1.17 we introduced an incompatible version 3 and dropped
> > >>> version 1, we can only update the version to 2, so the version should be
> > >>> per exec node.
> > >>>
> > >>> ExecNode version *1* is not supported anymore. Even though the state
> is
> > >>> actually compatible. The plan restore will fail with a helpful
> exception
> > >>> that forces users to perform plan migration.
> > >>>
> > >>> COMPILE PLAN '/mydir/plan_new.json' FROM '/mydir/plan_old.json';
> > >>>
> > >>> The plan migration will safely replace the old version *1* with *2*. The
> > >>> JSON plan flinkVersion changes to *1.17*.
> > >>>
> > >>>
> > >>> Best,
> > >>>
> > >>> Wenlong
> > >>>
> > >>> On Thu, 9 Dec 2021 at 18:36, Timo Walther <twal...@apache.org>
> wrote:
> > >>>
> > >>>> Hi Jing and Godfrey,
> > >>>>
> > >>>> I had another iteration over the document. There are two major
> changes:
> > >>>>
> > >>>> 1. Supported Flink Upgrade Versions
> > >>>>
> > >>>> I got the feedback via various channels that a step size of one
> minor
> > >>>> version is not very convenient. As you said, "because upgrading to
> a new
> > >>>> version is a time-consuming process". I rephrased this section:
> > >>>>
> > >>>> Upgrading usually involves work, which is why many users perform this
> > >>>> task rarely (e.g. only once per year). Also, skipping versions is
> > >>>> common until a new feature has been introduced for which it is worth
> > >>>> upgrading. We will support the upgrade to the most recent Flink version
> > >>>> from a set of previous versions. We aim to support upgrades from the
> > >>>> last 2-3 releases on a best-effort basis; maybe even more depending
> on
> > >>>> the maintenance overhead. However, in order to not grow the testing
> > >>>> matrix infinitely and to perform important refactoring if
> necessary, we
> > >>>> only guarantee upgrades with a step size of a single minor version
> (i.e.
> > >>>> a cascade of upgrades).
> > >>>>
> > >>>> 2. Annotation Design
> > >>>>
> > >>>> I also adopted the multiple annotations design for the previous
> > >>>> supportedPlanFormat. So no array of versions anymore. I reworked the
> > >>>> section, please have a look with updated examples:
> > >>>>
> > >>>>
> > >>>>
> > >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489#FLIP190:SupportVersionUpgradesforTableAPI&SQLPrograms-ExecNodeTests
> > >>>>
> > >>>> I also got the feedback offline that `savepoint` might not be the
> right
> > >>>> terminology for the annotation. I changed that to minPlanVersion and
> > >>>> minStateVersion.
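> > >>>>
> > >>>> To make this a bit more tangible, here is a simplified sketch of the
> > >>>> annotation with the new names (only the two version fields are from the
> > >>>> proposal; the enum and the container annotation are illustrative, see
> > >>>> the wiki page for the actual design):
> > >>>>
> > >>>> import java.lang.annotation.*;
> > >>>>
> > >>>> // Illustrative stand-in for an enum of minor releases.
> > >>>> enum FlinkVersion { v1_15, v1_16, v1_17 }
> > >>>>
> > >>>> @Target(ElementType.TYPE)
> > >>>> @Retention(RetentionPolicy.RUNTIME)
> > >>>> @Repeatable(ExecNodeMetadata.Multiple.class)
> > >>>> @interface ExecNodeMetadata {
> > >>>>     String name();
> > >>>>     int version();
> > >>>>     FlinkVersion minPlanVersion();
> > >>>>     FlinkVersion minStateVersion();
> > >>>>
> > >>>>     // Container required by Java to repeat the annotation on one class.
> > >>>>     @Target(ElementType.TYPE)
> > >>>>     @Retention(RetentionPolicy.RUNTIME)
> > >>>>     @interface Multiple {
> > >>>>         ExecNodeMetadata[] value();
> > >>>>     }
> > >>>> }
> > >>>>
> > >>>> A "state layout changed slightly in 1.16" case is then just a second
> > >>>> @ExecNodeMetadata entry with version=2 and minStateVersion=v1_16 on the
> > >>>> same class.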
> > >>>>
> > >>>> Let me know what you think.
> > >>>>
> > >>>> Regards,
> > >>>> Timo
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 09.12.21 08:44, Jing Zhang wrote:
> > >>>>> Hi Timo,
> > >>>>> Thanks a lot for driving this discussion.
> > >>>>> I believe it could solve many of the problems we are suffering from when
> > >>>>> upgrading.
> > >>>>>
> > >>>>> I only have a small complaint about the following point.
> > >>>>>
> > >>>>>> For simplification of the design, we assume that upgrades use a
> step
> > >>>> size
> > >>>>> of a single minor version. We don't guarantee skipping minor
> versions
> > >>>> (e.g.
> > >>>>> 1.11 to
> > >>>>> 1.14).
> > >>>>>
> > >>>>> In our internal production environment, we follow up with the community's
> > >>>>> latest stable release version roughly once a year because upgrading to a
> > >>>>> new version is a time-consuming process.
> > >>>>> So we might have skipped 1~3 versions by the time we upgrade to the latest
> > >>>>> version. This might also be the case at other companies.
> > >>>>> Could we guarantee that FLIP-190 works if we skip fewer minor versions than
> > >>>>> a specified threshold?
> > >>>>> Then we could know which version is good for us when preparing to upgrade.
> > >>>>>
> > >>>>> Best,
> > >>>>> Jing Zhang
> > >>>>>
> > >>>>> godfrey he <godfre...@gmail.com> wrote on Wed, 8 Dec 2021 at 22:16:
> > >>>>>
> > >>>>>> Hi Timo,
> > >>>>>>
> > >>>>>> Thanks for the explanation, it's much clearer now.
> > >>>>>>
> > >>>>>> One thing I want to confirm about `supportedPlanFormat`
> > >>>>>> and `supportedSavepointFormat`:
> > >>>>>> `supportedPlanFormat` supports multiple versions,
> > >>>>>> while `supportedSavepointFormat` supports only one version?
> > >>>>>> A JSON plan can be deserialized by multiple versions
> > >>>>>> because default values will be set for new fields.
> > >>>>>> In theory, a savepoint can be restored by more than one version
> > >>>>>> of the operators even if a state layout is changed,
> > >>>>>> such as by deleting a whole state and starting the job with
> > >>>>>> `allowNonRestoredState`=true.
> > >>>>>> I think this is a corner case, and it's hard to understand compared
> > >>>>>> to `supportedPlanFormat` supporting multiple versions.
> > >>>>>> So, in most cases, when the state layout is changed, the savepoint is
> > >>>>>> incompatible,
> > >>>>>> and `supportedSavepointFormat` and the version need to be changed.
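> > >>>>>>
> > >>>>>> Just to illustrate the "default value for new fields" point with a toy
> > >>>>>> example (not actual Flink code, names are made up):
> > >>>>>>
> > >>>>>> import com.fasterxml.jackson.databind.ObjectMapper;
> > >>>>>>
> > >>>>>> public class DefaultFieldExample {
> > >>>>>>     // Node fragment as of 1.16: 'someFlag' was added later and has a
> > >>>>>>     // default, so a 1.15 plan that lacks the field still deserializes.
> > >>>>>>     public static class NodeSpec {
> > >>>>>>         public int someProp;
> > >>>>>>         public boolean someFlag = false;
> > >>>>>>     }
> > >>>>>>
> > >>>>>>     public static void main(String[] args) throws Exception {
> > >>>>>>         String oldFragment = "{\"someProp\": 42}"; // written by 1.15
> > >>>>>>         NodeSpec spec = new ObjectMapper().readValue(oldFragment, NodeSpec.class);
> > >>>>>>         System.out.println(spec.someProp + " " + spec.someFlag); // 42 false
> > >>>>>>     }
> > >>>>>> }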
> > >>>>>>
> > >>>>>> I think we need a detailed explanation of the annotation change story in
> > >>>>>> the Javadoc of the `ExecNodeMetadata` class for all developers
> > >>>>>> (esp. those unfamiliar with this part).
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Godfrey
> > >>>>>>
> > >>>>>> Timo Walther <twal...@apache.org> wrote on Wed, 8 Dec 2021 at 16:57:
> > >>>>>>>
> > >>>>>>> Hi Wenlong,
> > >>>>>>>
> > >>>>>>> thanks for the feedback. Great that we reached consensus here. I
> will
> > >>>>>>> update the entire document with my previous example shortly.
> > >>>>>>>
> > >>>>>>>     > if we don't update the version when plan format changes, we can't
> > >>>>>>>     > find that the plan cannot be deserialized in 1.15
> > >>>>>>>
> > >>>>>>> This should not be a problem as the entire plan file has a
> version as
> > >>>>>>> well. We should not allow reading a 1.16 plan in 1.15. We can
> throw a
> > >>>>>>> helpful exception early.
> > >>>>>>>
> > >>>>>>> Reading a 1.15 plan in 1.16 is possible until we drop the old
> > >>>>>>> `supportedPlanFormat` from one of the used ExecNodes. Afterwards all
> > >>>>>>> `supportedPlanFormat` of ExecNodes must be equal to or higher than the
> > >>>>>>> plan version.
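> > >>>>>>>
> > >>>>>>> E.g. something along these lines while loading the plan file (purely
> > >>>>>>> illustrative, not the actual code):
> > >>>>>>>
> > >>>>>>> // Illustrative sketch of the early check; assumes the 1.x line only.
> > >>>>>>> final class PlanVersionCheck {
> > >>>>>>>
> > >>>>>>>     static void checkReadable(String planVersion, String flinkVersion) {
> > >>>>>>>         if (minor(planVersion) > minor(flinkVersion)) {
> > >>>>>>>             throw new UnsupportedOperationException(
> > >>>>>>>                 "The plan was compiled with Flink " + planVersion
> > >>>>>>>                     + " and cannot be restored with Flink " + flinkVersion
> > >>>>>>>                     + ". Please recompile the plan with this version.");
> > >>>>>>>         }
> > >>>>>>>     }
> > >>>>>>>
> > >>>>>>>     private static int minor(String version) {
> > >>>>>>>         return Integer.parseInt(version.split("\\.")[1]);
> > >>>>>>>     }
> > >>>>>>> }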
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>> Timo
> > >>>>>>>
> > >>>>>>> On 08.12.21 03:07, wenlong.lwl wrote:
> > >>>>>>>> Hi, Timo, +1 for multi metadata.
> > >>>>>>>>
> > >>>>>>>> The compatible change I meant in the last email is the slight state
> > >>>>>>>> change example you gave, so we have actually reached consensus on this,
> > >>>>>>>> IMO.
> > >>>>>>>>
> > >>>>>>>> Another question based on the example you gave:
> > >>>>>>>> In the example "JSON node gets an additional property in 1.16", if we
> > >>>>>>>> don't update the version when the plan format changes, we can't detect
> > >>>>>>>> that the plan cannot be deserialized in 1.15, although the savepoint
> > >>>>>>>> state is compatible.
> > >>>>>>>> The error message may not be very friendly if we just throw a
> > >>>>>>>> deserialization failure.
> > >>>>>>>>
> > >>>>>>>> On Tue, 7 Dec 2021 at 16:49, Timo Walther <twal...@apache.org>
> > >> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Wenlong,
> > >>>>>>>>>
> > >>>>>>>>>      > First,  we add a newStateLayout because of some
> improvement
> > >> in
> > >>>>>> state, in
> > >>>>>>>>>      > order to keep compatibility we may still keep the old
> state
> > >> for
> > >>>>>> the
> > >>>>>>>>> first
> > >>>>>>>>>      > version. We need to update the version, so that we can
> > >> generate
> > >>>> a
> > >>>>>> new
> > >>>>>>>>>      > version plan for the new job and keep the exec node
> > >> compatible
> > >>>>>> with
> > >>>>>>>>> the old
> > >>>>>>>>>      > version plan.
> > >>>>>>>>>
> > >>>>>>>>> The problem that I see here for contributors is that the actual
> > >>>> update
> > >>>>>>>>> of a version is more complicated than just updating an integer
> > >> value.
> > >>>>>> It
> > >>>>>>>>> means copying a lot of ExecNode code for a change that happens
> > >>>> locally
> > >>>>>>>>> in an operator. Let's assume multiple ExecNodes use a similar
> > >>>>>> operator.
> > >>>>>>>>> Why do we need to update all ExecNode versions if the operator itself
> > >>>>>>>>> can deal with the incompatibility? The ExecNode version is meant for
> > >>>>>>>>> topology changes or fundamental state changes.
> > >>>>>>>>>
> > >>>>>>>>> If we don't find consensus on this topic, I would at least
> vote for
> > >>>>>>>>> supporting multiple annotations for an ExecNode class. This
> way we
> > >>>>>> don't
> > >>>>>>>>> need to copy code but only add two ExecNode annotations with
> > >>>> different
> > >>>>>>>>> ExecNode versions.
> > >>>>>>>>>
> > >>>>>>>>>      > Maybe we can add support for this case :
> > >>>>>>>>>      > when an exec node is changed in 1.16, but is compatible
> with
> > >>>> 1.15,
> > >>>>>>>>>      > we can use the node of 1.16 to deserialize the plan of
> 1.15.
> > >>>>>>>>>
> > >>>>>>>>> If the ExecNode is compatible, there is no reason to increase
> the
> > >>>>>>>>> ExecNode version.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> I tried to come up with a reworked solution to make all parties
> > >>>> happy:
> > >>>>>>>>>
> > >>>>>>>>> 1. Let's assume the following annotations:
> > >>>>>>>>>
> > >>>>>>>>> supportedPlanFormat = [1.15]
> > >>>>>>>>>
> > >>>>>>>>> supportedSavepointFormat = 1.15
> > >>>>>>>>>
> > >>>>>>>>> we drop `added` as it is equal to `supportedSavepointFormat`
> > >>>>>>>>>
> > >>>>>>>>> 2. Multiple annotations over ExecNodes are possible:
> > >>>>>>>>>
> > >>>>>>>>> // operator state changes
> > >>>>>>>>>
> > >>>>>>>>> // initial introduction in 1.15
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>>
> > >>>>>>>>> // state layout changed slightly in 1.16
> > >>>>>>>>> // - operator migration is possible
> > >>>>>>>>> // - operator supports state of both versions and will perform
> > >>>>>> operator
> > >>>>>>>>> state migration
> > >>>>>>>>> // - new plans will get new ExecNode version
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.16)
> > >>>>>>>>>
> > >>>>>>>>> // we force a plan migration in 1.17
> > >>>>>>>>> // - we assume that all operator states have been migrated in
> the
> > >>>>>>>>> previous version
> > >>>>>>>>> // - we can safely replace the old version `1` with `2` and
> only
> > >> keep
> > >>>>>>>>> the new savepoint format
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.16)
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> // plan changes
> > >>>>>>>>>
> > >>>>>>>>> // initial introduction in 1.15
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>>
> > >>>>>>>>> // JSON node gets an additional property in 1.16
> > >>>>>>>>> // e.g. { some-prop: 42 } -> { some-prop: 42, some-flag: false}
> > >>>>>>>>> // - ExecNode version does not change
> > >>>>>>>>> // - ExecNode version only changes when topology or state is
> > >> affected
> > >>>>>>>>> // - we support both JSON plan formats, the old and the newest
> one
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=[1.15,
> > >>>> 1.16],
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>>
> > >>>>>>>>> // we force a plan migration in 1.17
> > >>>>>>>>> // - now we only support 1.16 plan format
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.16,
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> // topology change
> > >>>>>>>>>
> > >>>>>>>>> // initial introduction in 1.15
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=1, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.15)
> > >>>>>>>>>
> > >>>>>>>>> // complete new class structure in 1.16 annotated with
> > >>>>>>>>> @ExecNodeMetadata(name=A, version=2, supportedPlanFormat=1.15,
> > >>>>>>>>> supportedSavepointFormat=1.16)
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> What do you think?
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Timo
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On 07.12.21 08:20, wenlong.lwl wrote:
> > >>>>>>>>>> Maybe we can add support for this case:
> > >>>>>>>>>>              when an exec node is changed in 1.16, but is compatible
> > >>>>>>>>>>              with 1.15,
> > >>>>>>>>>> we can use the node of 1.16 to deserialize the plan of 1.15.
> > >>>>>>>>>> This way, we don't need to fork the code if the change is compatible,
> > >>>>>>>>>> and we can avoid forking code frequently.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Wenlong
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, 7 Dec 2021 at 15:08, wenlong.lwl <
> wenlong88....@gmail.com
> > >>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi, Timo, I would prefer to update the version every time we change
> > >>>>>>>>>>> the state layout too.
> > >>>>>>>>>>>
> > >>>>>>>>>>> It could be possible that we change the exec node in 2 steps:
> > >>>>>>>>>>> First,  we add a newStateLayout because of some improvement
> in
> > >>>>>> state, in
> > >>>>>>>>>>> order to keep compatibility we may still keep the old state
> for
> > >> the
> > >>>>>>>>> first
> > >>>>>>>>>>> version. We need to update the version, so that we can
> generate a
> > >>>>>> new
> > >>>>>>>>>>> version plan for the new job and keep the exec node
> compatible
> > >> with
> > >>>>>> the
> > >>>>>>>>> old
> > >>>>>>>>>>> version plan.
> > >>>>>>>>>>> After some versions, we may remove the old version state
> layout
> > >> and
> > >>>>>>>>> clean
> > >>>>>>>>>>> up the deprecated code. We still need to update the version,
> so
> > >>>>>> that we
> > >>>>>>>>> can
> > >>>>>>>>>>> verify that we are compatible with the plan after the first
> > >> change,
> > >>>>>> but
> > >>>>>>>>> not
> > >>>>>>>>>>> compatible with the plan earlier.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Wenlong
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, 6 Dec 2021 at 21:27, Timo Walther <
> twal...@apache.org>
> > >>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi Godfrey,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>       > design makes things more complex.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Yes, the design might be a bit more complex. But operator
> > >>>>>> migration is
> > >>>>>>>>>>>> way easier than ExecNode migration at a later point in time
> for
> > >>>>>> code
> > >>>>>>>>>>>> maintenance. We know that ExecNodes can become pretty
> complex.
> > >>>> Even
> > >>>>>>>>>>>> though we have put a lot of code into `CommonXXExecNode` it
> will
> > >>>>>> be a
> > >>>>>>>>>>>> lot of work to maintain multiple versions of ExecNodes. If
> we
> > >> can
> > >>>>>> avoid
> > >>>>>>>>>>>> this with operator state migration, this should always be
> > >>>> preferred
> > >>>>>>>>> over
> > >>>>>>>>>>>> a new ExecNode version.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I'm aware that operator state migration might only be
> important
> > >>>> for
> > >>>>>>>>>>>> roughly 10 % of all changes. A new ExecNode version will be
> used
> > >>>>>> for
> > >>>>>>>>> 90%
> > >>>>>>>>>>>> of all changes.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>       > If there are multiple state layouts, which layout
> the
> > >>>> ExecNode
> > >>>>>>>>> should
> > >>>>>>>>>>>> use?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> It is not the responsibility of the ExecNode to decide this
> but
> > >>>> the
> > >>>>>>>>>>>> operator. Something like:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> class X extends ProcessFunction {
> > >>>>>>>>>>>>         ValueState<A> oldStateLayout;
> > >>>>>>>>>>>>         ValueState<B> newStateLayout;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>         open() {
> > >>>>>>>>>>>>           if (oldStateLayout.get() != null) {
> > >>>>>>>>>>>>             performOperatorMigration();
> > >>>>>>>>>>>>           }
> > >>>>>>>>>>>>           useNewStateLayout();
> > >>>>>>>>>>>>         }
> > >>>>>>>>>>>> }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Operator migration is meant for smaller "more local" changes
> > >>>>>> without
> > >>>>>>>>>>>> touching the ExecNode layer. The CEP library and DataStream
> API
> > >>>>>> sources
> > >>>>>>>>>>>> are performing operator migration for years already.
> > >>>>>>>>>>>>
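> > >>>>>>>>>>>> A slightly more complete sketch of what I mean (simplified and only
> > >>>>>>>>>>>> illustrative, OldLayout/NewLayout are placeholder POJOs; for keyed
> > >>>>>>>>>>>> state the migration has to happen per key on first access, since
> > >>>>>>>>>>>> there is no key context in open()):
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> import org.apache.flink.api.common.state.ValueState;
> > >>>>>>>>>>>> import org.apache.flink.api.common.state.ValueStateDescriptor;
> > >>>>>>>>>>>> import org.apache.flink.configuration.Configuration;
> > >>>>>>>>>>>> import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
> > >>>>>>>>>>>> import org.apache.flink.util.Collector;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> class MigratingFunction extends KeyedProcessFunction<String, String, String> {
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     private transient ValueState<OldLayout> oldState;
> > >>>>>>>>>>>>     private transient ValueState<NewLayout> newState;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     @Override
> > >>>>>>>>>>>>     public void open(Configuration parameters) {
> > >>>>>>>>>>>>         oldState = getRuntimeContext().getState(
> > >>>>>>>>>>>>                 new ValueStateDescriptor<>("state-v1", OldLayout.class));
> > >>>>>>>>>>>>         newState = getRuntimeContext().getState(
> > >>>>>>>>>>>>                 new ValueStateDescriptor<>("state-v2", NewLayout.class));
> > >>>>>>>>>>>>     }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     @Override
> > >>>>>>>>>>>>     public void processElement(String value, Context ctx, Collector<String> out)
> > >>>>>>>>>>>>             throws Exception {
> > >>>>>>>>>>>>         // Perform the operator migration lazily for the current key.
> > >>>>>>>>>>>>         OldLayout old = oldState.value();
> > >>>>>>>>>>>>         if (old != null) {
> > >>>>>>>>>>>>             newState.update(NewLayout.fromOld(old));
> > >>>>>>>>>>>>             oldState.clear();
> > >>>>>>>>>>>>         }
> > >>>>>>>>>>>>         // ... continue processing using newState only ...
> > >>>>>>>>>>>>     }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     public static class OldLayout { public long count; }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>     public static class NewLayout {
> > >>>>>>>>>>>>         public long count;
> > >>>>>>>>>>>>         public long lastSeen;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>         static NewLayout fromOld(OldLayout old) {
> > >>>>>>>>>>>>             NewLayout n = new NewLayout();
> > >>>>>>>>>>>>             n.count = old.count;
> > >>>>>>>>>>>>             return n;
> > >>>>>>>>>>>>         }
> > >>>>>>>>>>>>     }
> > >>>>>>>>>>>> }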
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>       > `supportedPlanChanges ` and
> `supportedSavepointChanges `
> > >> are
> > >>>>>> a bit
> > >>>>>>>>>>>> obscure.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Let me try to come up with more examples why I think both
> > >>>>>> annotation
> > >>>>>>>>>>>> make sense and are esp. important *for test coverage*.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> supportedPlanChanges:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Let's assume we have some JSON in Flink 1.15:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> {
> > >>>>>>>>>>>>         some-prop: 42
> > >>>>>>>>>>>> }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> And we want to extend the JSON in Flink 1.16:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> {
> > >>>>>>>>>>>>         some-prop: 42,
> > >>>>>>>>>>>>         some-flag: false
> > >>>>>>>>>>>> }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Maybe we don't need to increase the ExecNode version but
> only
> > >>>>>> ensure
> > >>>>>>>>>>>> that the flag is set to `false` by default for the older
> > >> versions.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> We need a location to track changes and document the
> changelog.
> > >>>>>> With
> > >>>>>>>>> the
> > >>>>>>>>>>>> help of the annotation supportedPlanChanges = [1.15, 1.16]
> we
> > >> can
> > >>>>>>>>> verify
> > >>>>>>>>>>>> that we have tests for both JSON formats.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> And once we decide to drop the 1.15 format, we enforce plan
> > >>>>>> migration
> > >>>>>>>>>>>> and fill-in the default value `false` into the old plans and
> > >> bump
> > >>>>>> their
> > >>>>>>>>>>>> JSON plan version to 1.16 or higher.
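> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Such a plan migration could then be as simple as the following toy
> > >>>>>>>>>>>> sketch (Jackson-based, the property names are just the ones from the
> > >>>>>>>>>>>> example above, nothing here is the real plan structure):
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> import com.fasterxml.jackson.databind.ObjectMapper;
> > >>>>>>>>>>>> import com.fasterxml.jackson.databind.node.ObjectNode;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> public class PlanMigrationSketch {
> > >>>>>>>>>>>>     public static void main(String[] args) throws Exception {
> > >>>>>>>>>>>>         ObjectMapper mapper = new ObjectMapper();
> > >>>>>>>>>>>>         ObjectNode node = (ObjectNode) mapper.readTree("{\"some-prop\": 42}");
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>         // Fill in the default for the property introduced in 1.16 ...
> > >>>>>>>>>>>>         if (!node.has("some-flag")) {
> > >>>>>>>>>>>>             node.put("some-flag", false);
> > >>>>>>>>>>>>         }
> > >>>>>>>>>>>>         // ... so the migrated fragment conforms to the 1.16 format.
> > >>>>>>>>>>>>         System.out.println(mapper.writeValueAsString(node));
> > >>>>>>>>>>>>         // {"some-prop":42,"some-flag":false}
> > >>>>>>>>>>>>     }
> > >>>>>>>>>>>> }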
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>       > once the state layout is changed, the ExecNode
> version
> > >> needs
> > >>>>>> also
> > >>>>>>>>> be
> > >>>>>>>>>>>> updated
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This will still be the majority of cases. But if we can
> avoid
> > >>>>>> this, we
> > >>>>>>>>>>>> should do it for not having too much duplicate code to
> maintain.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>> Timo
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On 06.12.21 09:58, godfrey he wrote:
> > >>>>>>>>>>>>> Hi, Timo,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for the detailed explanation.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> We change an operator state of B in Flink 1.16. We
> perform the
> > >>>>>> change
> > >>>>>>>>>>>> in the operator of B in a way to support both state layouts.
> > >> Thus,
> > >>>>>> no
> > >>>>>>>>> need
> > >>>>>>>>>>>> for a new ExecNode version.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I think this design makes things more complex.
> > >>>>>>>>>>>>> 1. If there are multiple state layouts, which layout should the
> > >>>>>>>>>>>>> ExecNode use?
> > >>>>>>>>>>>>> It increases the cost of understanding for developers (especially
> > >>>>>>>>>>>>> for Flink newcomers),
> > >>>>>>>>>>>>> making them prone to mistakes.
> > >>>>>>>>>>>>> 2. `supportedPlanChanges ` and `supportedSavepointChanges `
> > >> are a
> > >>>>>> bit
> > >>>>>>>>>>>> obscure.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The purpose of the ExecNode annotations is not only to support
> > >>>>>>>>>>>>> powerful validation,
> > >>>>>>>>>>>>> but more importantly to make it easy for developers to understand,
> > >>>>>>>>>>>>> to ensure that every modification is easy and state compatible.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I prefer that once the state layout is changed, the ExecNode
> > >>>>>>>>>>>>> version also needs to be updated,
> > >>>>>>>>>>>>> which would keep things simple. How about
> > >>>>>>>>>>>>> renaming `supportedPlanChanges` to `planCompatibleVersion`
> > >>>>>>>>>>>>> (which means the plan is compatible with the plan generated by the
> > >>>>>>>>>>>>> given version of the node)
> > >>>>>>>>>>>>> and renaming `supportedSavepointChanges` to `savepointCompatibleVersion`
> > >>>>>>>>>>>>> (which means the state is compatible with the state generated by the
> > >>>>>>>>>>>>> given version of the node)?
> > >>>>>>>>>>>>> The names also indicate that only one version value can be set.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> WDYT?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Godfrey
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Timo Walther <twal...@apache.org> wrote on Thu, 2 Dec 2021 at 23:42:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Response to Marios's feedback:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>        > there should be some good logging in place when
> the
> > >>>>>> upgrade is
> > >>>>>>>>>>>> taking
> > >>>>>>>>>>>>>> place
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Yes, I agree. I added this part to the FLIP.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>        > config option instead that doesn't provide the
> > >>>> flexibility
> > >>>>>> to
> > >>>>>>>>>>>>>> overwrite certain plans
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> One can set the config option also around sections of the
> > >>>>>>>>>>>>>> multi-statement SQL script.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> SET 'table.plan.force-recompile'='true';
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> COMPILE ...
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> SET 'table.plan.force-recompile'='false';
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> But the question is why a user wants to run COMPILE
> multiple
> > >>>>>> times.
> > >>>>>>>>> If
> > >>>>>>>>>>>>>> it is during development, then running EXECUTE (or just
> the
> > >>>>>> statement
> > >>>>>>>>>>>>>> itself) without calling COMPILE should be sufficient. The
> file
> > >>>>>> can
> > >>>>>>>>> also
> > >>>>>>>>>>>>>> manually be deleted if necessary.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On 02.12.21 16:09, Timo Walther wrote:
> > >>>>>>>>>>>>>>> Hi Till,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Yes, you might have to. But not a new plan from the SQL
> query
> > >>>>>> but a
> > >>>>>>>>>>>>>>> migration from the old plan to the new plan. This will
> not
> > >>>>>> happen
> > >>>>>>>>>>>> often.
> > >>>>>>>>>>>>>>> But we need a way to evolve the format of the JSON plan
> > >> itself.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Maybe this confuses a bit, so let me clarify it again:
> Mostly
> > >>>>>>>>> ExecNode
> > >>>>>>>>>>>>>>> versions and operator state layouts will evolve. Not the
> plan
> > >>>>>> files,
> > >>>>>>>>>>>>>>> those will be pretty stable. But also not infinitely.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 02.12.21 16:01, Till Rohrmann wrote:
> > >>>>>>>>>>>>>>>> Then for migrating from Flink 1.10 to 1.12, I might
> have to
> > >>>>>> create
> > >>>>>>>>> a
> > >>>>>>>>>>>> new
> > >>>>>>>>>>>>>>>> plan using Flink 1.11 in order to migrate from Flink
> 1.11 to
> > >>>>>> 1.12,
> > >>>>>>>>>>>> right?
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>> Till
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Thu, Dec 2, 2021 at 3:39 PM Timo Walther <
> > >>>>>> twal...@apache.org>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Response to Till's feedback:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>         > compiled plan won't be changed after being
> written
> > >>>>>> initially
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> This is not entirely correct. We give guarantees for
> > >> keeping
> > >>>>>> the
> > >>>>>>>>>>>> query
> > >>>>>>>>>>>>>>>>> up and running. We reserve us the right to force plan
> > >>>>>> migrations.
> > >>>>>>>>> In
> > >>>>>>>>>>>>>>>>> this case, the plan might not be created from the SQL
> > >>>>>> statement
> > >>>>>>>>> but
> > >>>>>>>>>>>> from
> > >>>>>>>>>>>>>>>>> the old plan. I have added an example in section
> 10.1.1. In
> > >>>>>>>>> general,
> > >>>>>>>>>>>>>>>>> both persisted entities "plan" and "savepoint" can
> evolve
> > >>>>>>>>>>>> independently
> > >>>>>>>>>>>>>>>>> from each other.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On 02.12.21 15:10, Timo Walther wrote:
> > >>>>>>>>>>>>>>>>>> Response to Godfrey's feedback:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>         > "EXPLAIN PLAN EXECUTE STATEMENT SET BEGIN
> ...
> > >> END"
> > >>>> is
> > >>>>>>>>> missing.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks for the hint. I added a dedicated section
> 7.1.3.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>         > it's hard to maintain the supported
> versions for
> > >>>>>>>>>>>>>>>>>> "supportedPlanChanges" and "supportedSavepointChanges"
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Actually, I think we are mostly on the same page.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The annotation does not need to be updated for every
> Flink
> > >>>>>>>>>>>> version. As
> > >>>>>>>>>>>>>>>>>> the name suggests it is about "Changes" (in other
> words:
> > >>>>>>>>>>>>>>>>>> incompatibilities) that require some kind of
> migration.
> > >>>>>> Either
> > >>>>>>>>> plan
> > >>>>>>>>>>>>>>>>>> migration (= PlanChanges) or savepoint migration
> > >>>>>>>>>>>> (=SavepointChanges,
> > >>>>>>>>>>>>>>>>>> using operator migration or savepoint migration).
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Let's assume we introduced two ExecNodes A and B in
> Flink
> > >>>>>> 1.15.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The annotations are:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> @ExecNodeMetadata(name=A, supportedPlanChanges=1.15,
> > >>>>>>>>>>>>>>>>>> supportedSavepointChanges=1.15)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> @ExecNodeMetadata(name=B, supportedPlanChanges=1.15,
> > >>>>>>>>>>>>>>>>>> supportedSavepointChanges=1.15)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> We change an operator state of B in Flink 1.16.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> We perform the change in the operator of B in a way to
> > >>>>>> support
> > >>>>>>>>> both
> > >>>>>>>>>>>>>>>>>> state layouts. Thus, no need for a new ExecNode
> version.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The annotations in 1.16 are:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> @ExecNodeMetadata(name=A, supportedPlanChanges=1.15,
> > >>>>>>>>>>>>>>>>>> supportedSavepointChanges=1.15)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> @ExecNodeMetadata(name=B, supportedPlanChanges=1.15,
> > >>>>>>>>>>>>>>>>>> supportedSavepointChanges=1.15, 1.16)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> So the versions in the annotations are "start
> version"s.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I don't think we need end versions? End version would
> mean
> > >>>>>> that
> > >>>>>>>>> we
> > >>>>>>>>>>>> drop
> > >>>>>>>>>>>>>>>>>> the ExecNode from the code base?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Please check the section 10.1.1 again. I added a more
> > >>>> complex
> > >>>>>>>>>>>> example.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On 01.12.21 16:29, Timo Walther wrote:
> > >>>>>>>>>>>>>>>>>>> Response to Francesco's feedback:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > *Proposed changes #6*: Other than defining
> this
> > >>>> rule
> > >>>>>> of
> > >>>>>>>>>>>> thumb, we
> > >>>>>>>>>>>>>>>>>>> must also make sure that compiling plans with these
> > >> objects
> > >>>>>> that
> > >>>>>>>>>>>>>>>>>>> cannot be serialized in the plan must fail hard
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Yes, I totally agree. We will fail hard with a
> helpful
> > >>>>>>>>> exception.
> > >>>>>>>>>>>> Any
> > >>>>>>>>>>>>>>>>>>> mistake, e.g. using an inline object in Table API or an
> > >>>>>> invalid
> > >>>>>>>>>>>>>>>>>>> DataStream API source without uid should immediately
> > >> fail a
> > >>>>>> plan
> > >>>>>>>>>>>>>>>>>>> compilation step. I added a remark to the FLIP again.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > What worries me is breaking changes, in
> > >> particular
> > >>>>>>>>>>>> behavioural
> > >>>>>>>>>>>>>>>>>>> changes that might happen in connectors/formats
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Breaking changes in connectors and formats need to be
> > >>>>>> encoded in
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>> options. I could also imagine versioning the factory identifier,
> > >>>>>>>>>>>>>>>>>>> e.g. `connector=kafka` and `connector=kafka-2`, if this is
> > >>>>>>>>>>>>>>>>>>> necessary.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> After thinking about your question again, I think we
> will
> > >>>>>> also
> > >>>>>>>>>>>> need
> > >>>>>>>>>>>>>>>>>>> the same testing infrastructure for our connectors
> and
> > >>>>>> formats.
> > >>>>>>>>>>>> Esp.
> > >>>>>>>>>>>>>>>>>>> restore tests and completeness test. I updated the
> > >> document
> > >>>>>>>>>>>>>>>>>>> accordingly. Also I added a way to generate UIDs for
> > >>>>>> DataStream
> > >>>>>>>>>>>> API
> > >>>>>>>>>>>>>>>>>>> providers.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > *Functions:* Are we talking about the
> function
> > >> name
> > >>>>>> or the
> > >>>>>>>>>>>>>>>>>>> function
> > >>>>>>>>>>>>>>>>>>> complete signature?
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> For catalog functions, the identifier contains
> catalog
> > >> name
> > >>>>>> and
> > >>>>>>>>>>>>>>>>>>> database name. For system functions, identifier
> contains
> > >>>>>> only a
> > >>>>>>>>>>>> name
> > >>>>>>>>>>>>>>>>>>> which make function name and identifier identical. I
> > >>>>>> reworked
> > >>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>> section again and also fixed some of the naming
> conflicts
> > >>>>>> you
> > >>>>>>>>>>>>>>>>>>> mentioned.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > we should perhaps use a logically defined
> > >> unique id
> > >>>>>> like
> > >>>>>>>>>>>>>>>>>>> /bigIntToTimestamp/
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I added a concrete example for the resolution and
> > >>>>>> restoration.
> > >>>>>>>>> The
> > >>>>>>>>>>>>>>>>>>> unique id is composed of name + version. Internally,
> this
> > >>>> is
> > >>>>>>>>>>>>>>>>>>> represented as `$TO_TIMESTAMP_LTZ$1`.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > I think we should rather keep JSON out of
> the
> > >>>> concept
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Sounds ok to me. In SQL we also just call it "plan".
> I
> > >> will
> > >>>>>>>>>>>> change the
> > >>>>>>>>>>>>>>>>>>> file sections. But would suggest to keep the
> > >> fromJsonString
> > >>>>>>>>>>>> method.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>         > write it back in the original plan file
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I updated the terminology section for what we
> consider an
> > >>>>>>>>>>>> "upgrade".
> > >>>>>>>>>>>>>>>>>>> We might need to update the orginal plan file. This
> is
> > >>>>>> already
> > >>>>>>>>>>>>>>>>>>> considered in the COMPILE PLAN ... FROM ... even
> though
> > >>>>>> this is
> > >>>>>>>>>>>> future
> > >>>>>>>>>>>>>>>>>>> work. Also savepoint migration.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks for all the feedback!
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On 30.11.21 14:28, Timo Walther wrote:
> > >>>>>>>>>>>>>>>>>>>> Response to Wenlongs's feedback:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>         > I would prefer not to provide such a
> shortcut,
> > >> let
> > >>>>>> users
> > >>>>>>>>> use
> > >>>>>>>>>>>>>>>>>>>> COMPILE PLAN IF NOT EXISTS and EXECUTE explicitly,
> which
> > >>>>>> can be
> > >>>>>>>>>>>>>>>>>>>> understood by new users even without inferring the
> docs.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I would like to hear more opinions on this topic.
> > >>>>>> Personally, I
> > >>>>>>>>>>>> find
> > >>>>>>>>>>>>>>>>>>>> a combined statement very useful. Not only for
> quicker
> > >>>>>>>>>>>> development
> > >>>>>>>>>>>>>>>>>>>> and debugging but also for readability. It helps in
> > >>>>>> keeping the
> > >>>>>>>>>>>> JSON
> > >>>>>>>>>>>>>>>>>>>> path and the query close to each other in order to
> know
> > >>>> the
> > >>>>>>>>>>>> origin of
> > >>>>>>>>>>>>>>>>>>>> the plan.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>         > but the plan and SQL are not matched. The
> > >> result
> > >>>>>> would be
> > >>>>>>>>>>>> quite
> > >>>>>>>>>>>>>>>>>>>> confusing if we still execute the plan directly, we
> may
> > >>>>>> need to
> > >>>>>>>>>>>> add a
> > >>>>>>>>>>>>>>>>>>>> validation.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> You are right that there could be a mismatch. But we
> > >> have
> > >>>> a
> > >>>>>>>>>>>> similar
> > >>>>>>>>>>>>>>>>>>>> problem when executing CREATE TABLE IF NOT EXISTS.
> The
> > >>>>>> schema
> > >>>>>>>>> or
> > >>>>>>>>>>>>>>>>>>>> options of a table could have changed completely in
> the
> > >>>>>> catalog
> > >>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>>>> the CREATE TABLE IF NOT EXISTS is not executed
> again.
> > >> So a
> > >>>>>>>>>>>> mismatch
> > >>>>>>>>>>>>>>>>>>>> could also occur there.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> On 30.11.21 14:17, Timo Walther wrote:
> > >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> thanks for the feedback so far. Let me answer each
> > >> email
> > >>>>>>>>>>>>>>>>>>>>> indvidually.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> I will start with a response to Ingo's feedback:
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>         > Will the JSON plan's schema be
> considered an
> > >> API?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> No, not in the first version. This is explicitly
> > >>>>>> mentioned in
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>> `General JSON Plan Assumptions`. I tried to
> improve the
> > >>>>>>>>> section
> > >>>>>>>>>>>> once
> > >>>>>>>>>>>>>>>>>>>>> more to make it clearer. However, the JSON plan is
> > >>>>>> definitely
> > >>>>>>>>>>>> stable
> > >>>>>>>>>>>>>>>>>>>>> per minor version. And since the plan is versioned
> by
> > >>>>>> Flink
> > >>>>>>>>>>>> version,
> > >>>>>>>>>>>>>>>>>>>>> external tooling could be build around it. We might
> > >> make
> > >>>>>> it
> > >>>>>>>>>>>> public
> > >>>>>>>>>>>>>>>>>>>>> API once the design has settled.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>         > Given that upgrades across multiple
> versions
> > >> at
> > >>>>>> once are
> > >>>>>>>>>>>>>>>>>>>>> unsupported, do we verify this somehow?
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Good question. I extended the `General JSON Plan
> > >>>>>> Assumptions`.
> > >>>>>>>>>>>> Now
> > >>>>>>>>>>>>>>>>>>>>> yes: the Flink version is part of the JSON plan and
> > >> will
> > >>>>>> be
> > >>>>>>>>>>>> verified
> > >>>>>>>>>>>>>>>>>>>>> during restore. But keep in mind that we might support more
> > >>>>>>>>>>>>>>>>>>>>> than just the last version, at least until the JSON plan has
> > >>>>>>>>>>>>>>>>>>>>> been migrated.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> On 30.11.21 09:39, Marios Trivyzas wrote:
> > >>>>>>>>>>>>>>>>>>>>>> I have a question regarding the `COMPILE PLAN
> > >>>>>>>>>>>>>>>>>>>>>> OVERWRITE`.
> > >>>>>> If
> > >>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>>> choose to go
> > >>>>>>>>>>>>>>>>>>>>>> with the config option instead,
> > >>>>>>>>>>>>>>>>>>>>>> that doesn't provide the flexibility to overwrite
> > >>>> certain
> > >>>>>>>>>>>> plans but
> > >>>>>>>>>>>>>>>>>>>>>> not
> > >>>>>>>>>>>>>>>>>>>>>> others, since the config applies globally, isn't
> that
> > >>>>>>>>>>>>>>>>>>>>>> something to consider?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On Mon, Nov 29, 2021 at 10:15 AM Marios Trivyzas <
> > >>>>>>>>>>>> mat...@gmail.com>
> > >>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Hi Timo!
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Thanks a lot for taking all that time and effort
> to
> > >> put
> > >>>>>>>>>>>> together
> > >>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>>>>>>> proposal!
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Regarding:
> > >>>>>>>>>>>>>>>>>>>>>>>> For simplification of the design, we assume that
> > >>>>>> upgrades
> > >>>>>>>>>>>> use a
> > >>>>>>>>>>>>>>>>>>>>>>>> step size
> > >>>>>>>>>>>>>>>>>>>>>>> of a single
> > >>>>>>>>>>>>>>>>>>>>>>> minor version. We don't guarantee skipping minor
> > >>>>>> versions
> > >>>>>>>>>>>> (e.g.
> > >>>>>>>>>>>>>>>>>>>>>>> 1.11 to
> > >>>>>>>>>>>>>>>>>>>>>>> 1.14).
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I think that for this first step we should make
> it
> > >>>>>>>>> absolutely
> > >>>>>>>>>>>>>>>>>>>>>>> clear to the
> > >>>>>>>>>>>>>>>>>>>>>>> users that they would need to go through all
> > >>>>>> intermediate
> > >>>>>>>>>>>> versions
> > >>>>>>>>>>>>>>>>>>>>>>> to end up with the target version they wish. If
> we
> > >> are
> > >>>>>> to
> > >>>>>>>>>>>> support
> > >>>>>>>>>>>>>>>>>>>>>>> skipping
> > >>>>>>>>>>>>>>>>>>>>>>> versions in the future, i.e. upgrade from 1.14 to
> > >> 1.17,
> > >>>>>> this
> > >>>>>>>>>>>> means
> > >>>>>>>>>>>>>>>>>>>>>>> that we need to have a testing infrastructure in
> > >> place
> > >>>>>> that
> > >>>>>>>>>>>> would
> > >>>>>>>>>>>>>>>>>>>>>>> test all
> > >>>>>>>>>>>>>>>>>>>>>>> possible combinations of version upgrades, i.e.
> from
> > >>>>>> 1.14 to
> > >>>>>>>>>>>> 1.15,
> > >>>>>>>>>>>>>>>>>>>>>>> from 1.14 to 1.16 and so forth, while still
> testing
> > >> and
> > >>>>>> of
> > >>>>>>>>>>>> course
> > >>>>>>>>>>>>>>>>>>>>>>> supporting all the upgrades from the previous
> minor
> > >>>>>> version.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I like a lot the idea of introducing HINTS to
> define
> > >>>>>> some
> > >>>>>>>>>>>>>>>>>>>>>>> behaviour in the
> > >>>>>>>>>>>>>>>>>>>>>>> programs!
> > >>>>>>>>>>>>>>>>>>>>>>> - the hints live together with the sql
> statements and
> > >>>>>>>>>>>> consequently
> > >>>>>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>> (JSON) plans.
> > >>>>>>>>>>>>>>>>>>>>>>> - If multiple queries are involved in a program,
> each
> > >>>>>> one of
> > >>>>>>>>>>>> them
> > >>>>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>>>>>>> define its own config (regarding plan
> optimisation,
> > >> not
> > >>>>>> null
> > >>>>>>>>>>>>>>>>>>>>>>> enforcement,
> > >>>>>>>>>>>>>>>>>>>>>>> etc)
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I agree with Francesco on his argument regarding
> the
> > >>>>>> *JSON*
> > >>>>>>>>>>>>>>>>>>>>>>> plan. I
> > >>>>>>>>>>>>>>>>>>>>>>> believe we should already provide flexibility
> here,
> > >>>>>> since
> > >>>>>>>>> (who
> > >>>>>>>>>>>>>>>>>>>>>>> knows) in
> > >>>>>>>>>>>>>>>>>>>>>>> the future
> > >>>>>>>>>>>>>>>>>>>>>>> a JSON plan might not fulfil the desired
> > >> functionality.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I also agree that we need some very obvious way
> (i.e.
> > >>>>>> not
> > >>>>>>>>> log
> > >>>>>>>>>>>>>>>>>>>>>>> entry) to
> > >>>>>>>>>>>>>>>>>>>>>>> show the users that their program doesn't support
> > >>>>>> version
> > >>>>>>>>>>>>>>>>>>>>>>> upgrades, and
> > >>>>>>>>>>>>>>>>>>>>>>> prevent them from being negatively surprised in
> the
> > >>>>>> future,
> > >>>>>>>>>>>> when
> > >>>>>>>>>>>>>>>>>>>>>>> trying to
> > >>>>>>>>>>>>>>>>>>>>>>> upgrade their production pipelines.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> This is an implementation detail, but I'd like
> to add
> > >>>>>> that
> > >>>>>>>>>>>> there
> > >>>>>>>>>>>>>>>>>>>>>>> should be
> > >>>>>>>>>>>>>>>>>>>>>>> some good logging in place when the upgrade is
> taking
> > >>>>>> place,
> > >>>>>>>>>>>> to be
> > >>>>>>>>>>>>>>>>>>>>>>> able to track every restoration action, and help
> > >> debug
> > >>>>>> any
> > >>>>>>>>>>>>>>>>>>>>>>> potential
> > >>>>>>>>>>>>>>>>>>>>>>> issues arising from that.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> On Fri, Nov 26, 2021 at 2:54 PM Till Rohrmann
> > >>>>>>>>>>>>>>>>>>>>>>> <trohrm...@apache.org
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for writing this FLIP Timo. I think this
> will
> > >>>>>> be a
> > >>>>>>>>>>>> very
> > >>>>>>>>>>>>>>>>>>>>>>>> important
> > >>>>>>>>>>>>>>>>>>>>>>>> improvement for Flink and our SQL user :-)
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Similar to Francesco I would like to understand
> the
> > >>>>>>>>> statement
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> For simplification of the design, we assume
> that
> > >>>>>> upgrades
> > >>>>>>>>>>>> use a
> > >>>>>>>>>>>>>>>>>>>>>>>>> step
> > >>>>>>>>>>>>>>>>>>>>>>>> size
> > >>>>>>>>>>>>>>>>>>>>>>>> of a single
> > >>>>>>>>>>>>>>>>>>>>>>>> minor version. We don't guarantee skipping minor
> > >>>>>> versions
> > >>>>>>>>>>>> (e.g.
> > >>>>>>>>>>>>>>>>>>>>>>>> 1.11 to
> > >>>>>>>>>>>>>>>>>>>>>>>> 1.14).
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> a bit better. Is it because Flink does not
> guarantee
> > >>>>>> that a
> > >>>>>>>>>>>>>>>>>>>>>>>> savepoint
> > >>>>>>>>>>>>>>>>>>>>>>>> created by version 1.x can be directly
> recovered by
> > >>>>>> version
> > >>>>>>>>>>>> 1.y
> > >>>>>>>>>>>>>>>>>>>>>>>> with x + 1
> > >>>>>>>>>>>>>>>>>>>>>>>> < y but users might have to go through a
> cascade of
> > >>>>>>>>> upgrades?
> > >>>>>>>>>>>>>>>>>>>>>>>>        From how I
> > >>>>>>>>>>>>>>>>>>>>>>>> understand your proposal, the compiled plan
> won't be
> > >>>>>>>>> changed
> > >>>>>>>>>>>>>>>>>>>>>>>> after being
> > >>>>>>>>>>>>>>>>>>>>>>>> written initially. Hence, I would assume that
> for
> > >> the
> > >>>>>> plan
> > >>>>>>>>>>>> alone
> > >>>>>>>>>>>>>>>>>>>>>>>> Flink
> > >>>>>>>>>>>>>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>>>>>>>>> have to give backwards compatibility guarantees
> for
> > >>>> all
> > >>>>>>>>>>>> versions.
> > >>>>>>>>>>>>>>>>>>>>>>>> Am I
> > >>>>>>>>>>>>>>>>>>>>>>>> understanding this part correctly?
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>>>>>>>>> Till
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> On Thu, Nov 25, 2021 at 4:55 PM Francesco
> Guardiani
> > >> <
> > >>>>>>>>>>>>>>>>>>>>>>>> france...@ververica.com>
> > >>>>>>>>>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Hi Timo,
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks for putting this amazing work together,
> I
> > >> have
> > >>>>>> some
> > >>>>>>>>>>>>>>>>>>>>>>>>> considerations/questions
> > >>>>>>>>>>>>>>>>>>>>>>>>> about the FLIP:
> > >>>>>>>>>>>>>>>>>>>>>>>>> *Proposed changes #6*: Other than defining this
> > >> rule
> > >>>>>> of
> > >>>>>>>>>>>> thumb,
> > >>>>>>>>>>>>>>>>>>>>>>>>> we must
> > >>>>>>>>>>>>>>>>>>>>>>>>> also make sure
> > >>>>>>>>>>>>>>>>>>>>>>>>> that compiling plans with these objects that
> cannot
> > >>>> be
> > >>>>>>>>>>>>>>>>>>>>>>>>> serialized in the
> > >>>>>>>>>>>>>>>>>>>>>>>>> plan must fail hard,
> > >>>>>>>>>>>>>>>>>>>>>>>>> so users don't bite themselves with such
> issues, or
> > >>>> at
> > >>>>>>>>>>>> least we
> > >>>>>>>>>>>>>>>>>>>>>>>>> need to
> > >>>>>>>>>>>>>>>>>>>>>>>>> output warning
> > >>>>>>>>>>>>>>>>>>>>>>>>> logs. In general, whenever the user is trying
> to
> > >> use
> > >>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> CompiledPlan
> > >>>>>>>>>>>>>>>>>>>>>>>> APIs
> > >>>>>>>>>>>>>>>>>>>>>>>>> and at the same
> > >>>>>>>>>>>>>>>>>>>>>>>>> time, they're trying to do something "illegal"
> for
> > >>>> the
> > >>>>>>>>>>>> plan, we
> > >>>>>>>>>>>>>>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>>>>>>>>>> immediately either
> > >>>>>>>>>>>>>>>>>>>>>>>>> log or fail depending on the issue, in order to
> > >> avoid
> > >>>>>> any
> > >>>>>>>>>>>>>>>>>>>>>>>>> surprises once
> > >>>>>>>>>>>>>>>>>>>>>>>>> the user upgrades.
> > >>>>>>>>>>>>>>>>>>>>>>>>> I would also say the same for things like
> > >> registering
> > >>>>>> a
> > >>>>>>>>>>>>>>>>>>>>>>>>> function,
> > >>>>>>>>>>>>>>>>>>>>>>>>> registering a DataStream,
> > >>>>>>>>>>>>>>>>>>>>>>>>> and for every other thing which won't end up
> in the
> > >>>>>> plan,
> > >>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>>>>>> should log
> > >>>>>>>>>>>>>>>>>>>>>>>>> such info to the
> > >>>>>>>>>>>>>>>>>>>>>>>>> user by default.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> *General JSON Plan Assumptions #9:* When
> thinking
> > >> to
> > >>>>>>>>>>>> connectors
> > >>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>> formats, I think
> > >>>>>>>>>>>>>>>>>>>>>>>>> it's reasonable to assume and keep out of the
> > >> feature
> > >>>>>>>>> design
> > >>>>>>>>>>>>>>>>>>>>>>>>> that no
> > >>>>>>>>>>>>>>>>>>>>>>>>> feature/ability can
> > >>>>>>>>>>>>>>>>>>>>>>>>> be deleted from a connector/format. I also don't think
> think
> > >>>>>> new
> > >>>>>>>>>>>>>>>>>>>>>>>> features/abilities
> > >>>>>>>>>>>>>>>>>>>>>>>>> can influence
> > >>>>>>>>>>>>>>>>>>>>>>>>> this FLIP as well, given the plan is static,
> so if
> > >>>> for
> > >>>>>>>>>>>> example,
> > >>>>>>>>>>>>>>>>>>>>>>>>> MyCoolTableSink in the next
> > >>>>>>>>>>>>>>>>>>>>>>>>> flink version implements
> > >> SupportsProjectionsPushDown,
> > >>>>>> then
> > >>>>>>>>>>>> it
> > >>>>>>>>>>>>>>>>>>>>>>>>> shouldn't
> > >>>>>>>>>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>>>>>> a problem
> > >>>>>>>>>>>>>>>>>>>>>>>>> for the upgrade story since the plan is still
> > >>>>>> configured
> > >>>>>>>>> as
> > >>>>>>>>>>>>>>>>>>>>>>>>> computed
> > >>>>>>>>>>>>>>>>>>>>>>>> from
> > >>>>>>>>>>>>>>>>>>>>>>>>> the previous flink
> > >>>>>>>>>>>>>>>>>>>>>>>>> version. What worries me is breaking changes,
> in
> > >>>>>>>>> particular
> > >>>>>>>>>>>>>>>>>>>>>>>>> behavioural
> > >>>>>>>>>>>>>>>>>>>>>>>>> changes that
> > >>>>>>>>>>>>>>>>>>>>>>>>> might happen in connectors/formats. Although
> this
> > >>>>>> argument
> > >>>>>>>>>>>>>>>>>>>>>>>>> doesn't seem
> > >>>>>>>>>>>>>>>>>>>>>>>>> relevant for
> > >>>>>>>>>>>>>>>>>>>>>>>>> the connectors shipped by the flink project
> itself,
> > >>>>>>>>> because
> > >>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>>>>>> try to
> > >>>>>>>>>>>>>>>>>>>>>>>> keep
> > >>>>>>>>>>>>>>>>>>>>>>>>> them as stable as
> > >>>>>>>>>>>>>>>>>>>>>>>>> possible and avoid eventual breaking changes,
> it's
> > >>>>>>>>>>>> compelling to
> > >>>>>>>>>>>>>>>>>>>>>>>> external
> > >>>>>>>>>>>>>>>>>>>>>>>>> connectors and
> > >>>>>>>>>>>>>>>>>>>>>>>>> formats, which might be decoupled from the
> flink
> > >>>>>> release
> > >>>>>>>>>>>> cycle
> > >>>>>>>>>>>>>>>>>>>>>>>>> and might
> > >>>>>>>>>>>>>>>>>>>>>>>>> have different
> > >>>>>>>>>>>>>>>>>>>>>>>>> backward compatibility guarantees. It's totally
> > >>>>>> reasonable
> > >>>>>>>>>>>> if we
> > >>>>>>>>>>>>>>>>>>>>>>>>> don't
> > >>>>>>>>>>>>>>>>>>>>>>>>> want to tackle it in
> > >>>>>>>>>>>>>>>>>>>>>>>>> this first iteration of the feature, but it's
> > >>>>>> something we
> > >>>>>>>>>>>> need
> > >>>>>>>>>>>>>>>>>>>>>>>>> to keep
> > >>>>>>>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>>>>> mind for the future.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> *Functions:* It's not clear to me what you
> mean for
> > >>>>>>>>>>>>>>>>>>>>>>>>> "identifier",
> > >>>>>>>>>>>>>>>>>>>>>>>> because
> > >>>>>>>>>>>>>>>>>>>>>>>>> then somewhere
> > >>>>>>>>>>>>>>>>>>>>>>>>> else in the same context you talk about
> "name". Are
> > >>>> we
> > >>>>>>>>>>>> talking
> > >>>>>>>>>>>>>>>>>>>>>>>>> about the
> > >>>>>>>>>>>>>>>>>>>>>>>>> function name
> > >>>>>>>>>>>>>>>>>>>>>>>>> or the function complete signature? Let's
> assume
> > >> for
> > >>>>>>>>>>>> example we
> > >>>>>>>>>>>>>>>>>>>>>>>>> have
> > >>>>>>>>>>>>>>>>>>>>>>>> these
> > >>>>>>>>>>>>>>>>>>>>>>>>> function
> > >>>>>>>>>>>>>>>>>>>>>>>>> definitions:
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> * TO_TIMESTAMP_LTZ(BIGINT)
> > >>>>>>>>>>>>>>>>>>>>>>>>> * TO_TIMESTAMP_LTZ(STRING)
> > >>>>>>>>>>>>>>>>>>>>>>>>> * TO_TIMESTAMP_LTZ(STRING, STRING)
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> These for me are very different functions with
> > >>>>>> different
> > >>>>>>>>>>>>>>>>>>>>>>>> implementations,
> > >>>>>>>>>>>>>>>>>>>>>>>>> where each of
> > >>>>>>>>>>>>>>>>>>>>>>>>> them might evolve separately at a different
> pace.
> > >>>>>> Hence
> > >>>>>>>>>>>> when we
> > >>>>>>>>>>>>>>>>>>>>>>>>> store
> > >>>>>>>>>>>>>>>>>>>>>>>> them
> > >>>>>>>>>>>>>>>>>>>>>>>>> in the json
> > >>>>>>>>>>>>>>>>>>>>>>>>> plan we should perhaps use a logically defined
> > >> unique
> > >>>>>> id
> > >>>>>>>>>>>> like
> > >>>>>>>>>>>>>>>>>>>>>>>>> /bigIntToTimestamp/, /
> > >>>>>>>>>>>>>>>>>>>>>>>>> stringToTimestamp/ and
> > >> /stringToTimestampWithFormat/.
> > >>>>>> This
> > >>>>>>>>>>>> also
> > >>>>>>>>>>>>>>>>>>>>>>>>> solves
> > >>>>>>>>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> issue of
> > >>>>>>>>>>>>>>>>>>>>>>>>> correctly referencing the functions when
> restoring
> > >>>> the
> > >>>>>>>>> plan,
> > >>>>>>>>>>>>>>>>>>>>>>>>> without
> > >>>>>>>>>>>>>>>>>>>>>>>>> running again the
> > >>>>>>>>>>>>>>>>>>>>>>>>> inference logic (which might have been changed
> in
> > >> the
> > >>>>>>>>>>>> meantime)
> > >>>>>>>>>>>>>>>>>>>>>>>>> and it
> > >>>>>>>>>>>>>>>>>>>>>>>>> might also solve
> > >>>>>>>>>>>>>>>>>>>>>>>>> the versioning, that is the function
> identifier can
> > >>>>>>>>> contain
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> function
> > >>>>>>>>>>>>>>>>>>>>>>>>> version like /
> > >>>>>>>>>>>>>>>>>>>>>>>>> stringToTimestampWithFormat_1_1 /or
> > >>>>>>>>>>>>>>>>>>>>>>>>> /stringToTimestampWithFormat_1_2/.
> > >>>>>>>>>>>>>>>>>>>>>>>> An
> > >>>>>>>>>>>>>>>>>>>>>>>>> alternative could be to use the string
> signature
> > >>>>>>>>>>>> representation,
> > >>>>>>>>>>>>>>>>>>>>>>>>> which
> > >>>>>>>>>>>>>>>>>>>>>>>>> might not be trivial
> > >>>>>>>>>>>>>>>>>>>>>>>>> to compute, given the complexity of our type
> > >>>> inference
> > >>>>>>>>>>>> logic.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> *The term "JSON plan"*: I think we should
> rather
> > >> keep
> > >>>>>> JSON
> > >>>>>>>>>>>> out
> > >>>>>>>>>>>>>>>>>>>>>>>>> of the
> > >>>>>>>>>>>>>>>>>>>>>>>>> concept and just
> > >>>>>>>>>>>>>>>>>>>>>>>>> name it "Compiled Plan" (like the proposed
> API) or
> > >>>>>>>>> something
> > >>>>>>>>>>>>>>>>>>>>>>>>> similar,
> > >>>>>>>>>>>>>>>>>>>>>>>> as I
> > >>>>>>>>>>>>>>>>>>>>>>>>> see how in
> > >>>>>>>>>>>>>>>>>>>>>>>>> future we might decide to support/modify our
> > >>>>>> persistence
> > >>>>>>>>>>>>>>>>>>>>>>>>> format to
> > >>>>>>>>>>>>>>>>>>>>>>>>> something more
> > >>>>>>>>>>>>>>>>>>>>>>>>> efficient storage wise like BSON. For example,
> I
> > >>>> would
> > >>>>>>>>>>>> rename /
> > >>>>>>>>>>>>>>>>>>>>>>>>> CompiledPlan.fromJsonFile/ to simply
> > >>>>>>>>>>>> /CompiledPlan.fromFile/.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> *Who is the owner of the plan file?* I asked
> myself
> > >>>>>> this
> > >>>>>>>>>>>>>>>>>>>>>>>>> question when
> > >>>>>>>>>>>>>>>>>>>>>>>>> reading this:
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> For simplification of the design, we assume
> that
> > >>>>>> upgrades
> > >>>>>>>>>>>> use a
> > >>>>>>>>>>>>>>>>>>>>>>>>>> step
> > >>>>>>>>>>>>>>>>>>>>>>>>> size of a single
> > >>>>>>>>>>>>>>>>>>>>>>>>> minor version. We don't guarantee skipping
> minor
> > >>>>>> versions
> > >>>>>>>>>>>> (e.g.
> > >>>>>>>>>>>>>>>>>>>>>>>>> 1.11 to
> > >>>>>>>>>>>>>>>>>>>>>>>>> 1.14).
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> My understanding of this statement is that a
> user
> > >> can
> > >>>>>>>>>>>> upgrade
> > >>>>>>>>>>>>>>>>>>>>>>>>> between
> > >>>>>>>>>>>>>>>>>>>>>>>>> minors but then
> > >>>>>>>>>>>>>>>>>>>>>>>>> following all the minors, the same query can
> remain
> > >>>>>> up and
> > >>>>>>>>>>>>>>>>> running.
> > >>>>>>>>>>>>>>>>>>>>>>>> E.g. I
> > >>>>>>>>>>>>>>>>>>>>>>>>> upgrade from
> > >>>>>>>>>>>>>>>>>>>>>>>>> 1.15 to 1.16, and then from 1.16 to 1.17 and I
> > >> still
> > >>>>>>>>> expect
> > >>>>>>>>>>>> my
> > >>>>>>>>>>>>>>>>>>>>>>>>> original
> > >>>>>>>>>>>>>>>>>>>>>>>>> query to work
> > >>>>>>>>>>>>>>>>>>>>>>>>> without recomputing the plan. This necessarily
> > >> means
> > >>>>>> that
> > >>>>>>>>> at
> > >>>>>>>>>>>>>>>>>>>>>>>>> some point
> > >>>>>>>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>>>>> future
> > >>>>>>>>>>>>>>>>>>>>>>>>> releases we'll need some basic "migration"
> tool to
> > >>>>>> keep
> > >>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>> queries up
> > >>>>>>>>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>> running,
> > >>>>>>>>>>>>>>>>>>>>>>>>> ending up modifying the compiled plan. So I
> guess
> > >>>>>> flink
> > >>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>>>>>>>>>> write it
> > >>>>>>>>>>>>>>>>>>>>>>>>> back in the original
> > >>>>>>>>>>>>>>>>>>>>>>>>> plan file, perhaps doing a backup of the
> previous
> > >>>>>> one? Can
> > >>>>>>>>>>>> you
> > >>>>>>>>>>>>>>>>>>>>>>>>> please
> > >>>>>>>>>>>>>>>>>>>>>>>>> clarify this aspect?
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Except these considerations, the proposal looks
> > >> good
> > >>>>>> to me
> > >>>>>>>>>>>>>>>>>>>>>>>>> and I'm
> > >>>>>>>>>>>>>>>>>>>>>>>> eagerly
> > >>>>>>>>>>>>>>>>>>>>>>>>> waiting to see
> > >>>>>>>>>>>>>>>>>>>>>>>>> it in play.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>>>>>>> FG
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>>> Francesco Guardiani | Software Engineer
> > >>>>>>>>>>>>>>>>>>>>>>>>> france...@ververica.com[1]
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Follow us @VervericaData
> > >>>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>>> Join Flink Forward[2] - The Apache Flink
> Conference
> > >>>>>>>>>>>>>>>>>>>>>>>>> Stream Processing | Event Driven | Real Time
> > >>>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>>> Ververica GmbH | Invalidenstrasse 115, 10115
> > >> Berlin,
> > >>>>>>>>> Germany
> > >>>>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>>>> Ververica GmbH
> > >>>>>>>>>>>>>>>>>>>>>>>>> Registered at Amtsgericht Charlottenburg: HRB
> > >> 158244
> > >>>> B
> > >>>>>>>>>>>>>>>>>>>>>>>>> Managing Directors: Karl Anton Wehner, Holger
> > >> Temme,
> > >>>>>> Yip
> > >>>>>>>>>>>> Park
> > >>>>>>>>>>>>>>>>>>>>>>>>> Tung
> > >>>>>>>>>>>>>>>>>>>>>>>> Jason,
> > >>>>>>>>>>>>>>>>>>>>>>>>> Jinwei (Kevin)
> > >>>>>>>>>>>>>>>>>>>>>>>>> Zhang
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> --------
> > >>>>>>>>>>>>>>>>>>>>>>>>> [1] mailto:france...@ververica.com
> > >>>>>>>>>>>>>>>>>>>>>>>>> [2] https://flink-forward.org/
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>>>>>>> Marios
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
>
