Sure, Thanks a lot for the reminder ! On Fri, Oct 28, 2022 at 3:59 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
> I see the majority of the community prefers the not immediate breaking > of the API. Even though I have different preference I am more than fine > to commit to the others choice. Just please make sure it is prominently > documented that one or the other methods MUST be implemented in any new > serializer. > > Best, > > Dawid > > On 27/10/2022 08:33, Yun Gao wrote: > > Hi, > > I have discussed offline with Hangxiang and Yuan, and the current > > proposal also looks good to me. > > Thanks Hangxiang for driving this. > > Best, > > Yun Gao > > ------------------------------------------------------------------ > > From:Yuan Mei <yuanmei.w...@gmail.com> > > Send Time:2022 Oct. 26 (Wed.) 12:20 > > To:dev <dev@flink.apache.org> > > Subject:Re: [DISCUSS] FLIP-263: Improve resolving schema compatibility > > Hey Huangxiang, > > The section of `Rejected Alternatives` may also need an update. > > Current plan sounds like a reasonable one. I am fine with it. > > Thanks for driving this. > > Best > > Yuan > > On Tue, Oct 25, 2022 at 5:11 PM Hangxiang Yu <master...@gmail.com> > wrote: > >> (Resend the mail to fix the format issue) > >> Hi, everyone. > >> > >> Thanks for your suggestions! > >> > >> Let me summarize the remaining questions in the thread and share my > ideas > >> based on your suggestions: > >> > >> > >> 1. Should we put the new opposite interface in TypeSerializer or > >> TypeSerializerSnapshot ? > >> > >> Just as I replied to Dawid, I'd like to put it in > >> TypeSerializerSnapshot so that we could still follow the contract > between > >> two classes and make later code migration easier based on current tools. > >> > >> Thanks Dawid for the initial suggestion and Godfrey for the additional > >> supplement. > >> > >> > >> > >> 2. Do we just break changes and make user codes incompatible or make > sure > >> compatibility using a more suitable migration plan ? > >> > >> I agree with Yuan that we should make sure that user jobs still work > >> without modifying any codes before removing the deprecated method. > >> > >> Thanks Yuan for the migration plan. Let me try to add something to the > >> suggestion of Yuan: > >> > >> > >> a. In Step 1, I prefer to make the new interface like: > >> > >> > >> default TypeSerializerSchemaCompatibility<T> > >> resolveSchemaCompatibility( > >> // Use 'oldSnapshot' not 'oldSerializer' > >> TypeSerializerSnapshot<T> oldSnapshot) { > >> return INCOMPATIBLE; > >> } > >> } > >> > >> I think using 'oldSnapshot' as the parameter will make the > >> logic clear --- TypeSerializerSnapshot will take all responsibility for > >> compatibility checks. > >> BTW, It's also easy to migrate original check logic to this > >> interface. > >> > >> b. In Step 1, In addition to introducing default implementations > >> for both interfaces, we also need to implement the new interface in all > >> inner TypeSerializerSnapshots. > >> Users may implement their own serializers based on inner > >> serializers, we should make sure that the new interface of inner > >> TypeSerializerSnapshots is usable. > >> > >> Then I think it could work for both old custom serializers or new > >> custom serializers. > >> No matter which interface the user implements, it could always work. > >> Of course, we will deprecate the old interface and encourage users to > >> use the new one. > >> > >> > >> 3. Do we need to squash this with > >> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w < > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > ? > >> We will not break the compatibility based on 2, so it's not necessary > >> to squash them together. > >> > >> > >> Do you have any other suggestions ? Look forward to your reply! > >> > >> On Tue, Oct 25, 2022 at 1:31 PM Hangxiang Yu <master...@gmail.com> > wrote: > >> > >>> Hi, everyone. > >>> > >>> Thanks for your suggestions! > >>> > >>> Let me summarize the remaining questions in the thread and share my > ideas > >>> based on your suggestions: > >>> > >>> 1. Should we put the new opposite interface in TypeSerializer or > >>> TypeSerializerSnapshot ? > >>> > >>> Just as I replied to Dawid, I'd like to put it in > TypeSerializerSnapshot > >>> so that we could still follow the contract between two classes and make > >>> later code migration easier based on current tools. > >>> > >>> Thanks Dawid for the initial suggestion and Godfrey for the additional > >>> supplement. > >>> > >>> > >>> > >>> 1. Do we just break changes and make user codes incompatible or make > >>> sure compatibility using a more suitable migration plan ? > >>> > >>> I agree with Yuan that we should make sure that user jobs still work > >>> without modifying any codes before removing the deprecated method. > >>> > >>> Thanks Yuan for the migration plan. Let me try to add something to the > >>> suggestion of Yuan: > >>> > >>> 1. In Step 1, I prefer to make the new interface like: > >>> > >>> default TypeSerializerSchemaCompatibility<T> > resolveSchemaCompatibility( > >>> > >>> // Use 'oldSnapshot' not 'oldSerializer' > >>> TypeSerializerSnapshot<T> oldSnapshot) { > >>> > >>> return INCOMPATIBLE; > >>> > >>> } > >>> > >>> I think using 'oldSnapshot' as the parameter will make the logic clear > >> --- > >>> TypeSerializerSnapshot will take all responsibility for compatibility > >>> checks. > >>> > >>> BTW, It's also easy to migrate original check logic to this interface. > >>> > >>> 1. In Step 1, In addition to introducing default implementations for > >>> both interfaces, we also need to implement the new interface in > >> all inner > >>> TypeSerializerSnapshots. > >>> > >>> Users may implement their own serializers based on inner serializers, > we > >>> should make sure that the new interface of inner > TypeSerializerSnapshots > >> is > >>> usable. > >>> > >>> > >>> Then I think it could work for both old custom serializers or new > custom > >>> serializers. > >>> > >>> No matter which interface the user implements, it could always work. > >>> > >>> Of course, we will deprecate the old interface and encourage users to > use > >>> the new one. > >>> > >>> > >>> > >>> 1. Do we need to squash this with > >>> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w < > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > ? > >>> > >>> We will not break the compatibility based on 2, so it's not necessary > to > >>> squash them together. > >>> > >>> > >>> Do you have any other suggestions ? Look forward to your reply! > >>> > >>> On Mon, Oct 24, 2022 at 8:30 PM Yuan Mei <yuanmei.w...@gmail.com> > wrote: > >>> > >>>> Hey Hangxiang, > >>>> > >>>> Thanks for driving this issue. I've read through all the discussions > and > >>>> suggestions in this thread, and here is my take: > >>>> > >>>> 1. I agree that the compatibility check should be done in the opposite > >>>> direction. > >>>> The current interface *causes some real issues* for users using > >>>> their own `TypeSerializer`, and we should help resolve this issue. > >>>> > >>>> 2. The new opposite interface would better still stay within the class > >>>> `TypeSerializerSnapshot`. > >>>> I am suggesting this mostly considering the designed contract > >> between > >>>> `TypeSerializerSnapshot` and `TypeSerializer`. I do not think it is > >>>> necessary to break the current contract because of this change. > >>>> > >>>> 3. It seems to me that "breaking changes" are unavoidable in each of > the > >>>> strategies proposed. However, *I'd be very concerned and worried *to > >>>> make a breaking change *in the very first step*. > >>>> A more user-friendly way is to make sure of compatibility and mark > >>>> the method to drop as `@Deprecated`. After several versions of > adoption, > >>>> carefully and publicly discussing with users to drop the method. By > >>>> compatibilities, I mean the following things: > >>>> 1). upgrading from a lower Flink version to the new version does not > >>>> need the user to do extra work for the change. > >>>> 2). In the new version, both the old and the new method should be > >>>> supported, and of course, we would encourage users to use the new > >> method. > >>>> For example, dropping "TypeSerializerConfigSnapshot" started to take > >>>> into action after almost 10 releases the time it was marked > deprecated. > >>>> > >>>> Due to the above reason, I am hesitant to squash this FLIP with > >>>> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w < > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > > >>>> > >>>> 4. What do I suggest doing? > >>>> > >>>> *STEP 1: In `TypeSerializerSnapshot`, we do something like this, and > >>>> keep everything else exactly the same* > >>>> > >>>> @Deprecated > >>>> default TypeSerializerSchemaCompatibility<T> > >>>> resolveSchemaCompatibility( > >>>> TypeSerializer<T> newSerializer) { > >>>> // use new method > >>>> return newSerializer.snapshotConfiguration(). > >>>> > >>>> resolveSerializerSchemaCompatibility(this.restoreSerializer()); > >>>> > >>>> } > >>>> > >>>> default TypeSerializerSchemaCompatibility<T> > >>>> resolveSerializerSchemaCompatibility( > >>>> TypeSerializer<T> oldSerializer) { > >>>> return INCOMPATIBLE; > >>>> } > >>>> > >>>> *STEP 2: (This is a breaking change), after several releases* > >>>> 1. Remove the deprecated method > >>>> 2. Substitute every usage with the new method > >>>> 3. Make the new method not have a default implementation > >>>> > >>>> STEP1 is compatible, and STEP2 is a breaking change. > >>>> > >>>> Best > >>>> Yuan > >>>> > >>>> > >>>> > >>>> On Fri, Oct 21, 2022 at 10:01 AM godfrey he <godfre...@gmail.com> > >> wrote: > >>>>> Hi Hangxiang, Dawid, > >>>>> > >>>>> I also prefer to add method into TypeSerializerSnapshot, which looks > >>>>> more natural. TypeSerializerSnapshot has `Version` concept, which > also > >>>>> can be used for compatibility. > >>>>> ` > >>>>> TypeSerializerSnapshot { > >>>>> > >>>>> TypeSerializerSchemaCompatibility<T> > >> resolveSchemaCompatibility( > >>>>> TypeSerializerSnapshot<T> oldSnapshot); > >>>>> > >>>>> } > >>>>> ` > >>>>> > >>>>> Best, > >>>>> Godfrey > >>>>> > >>>>> Dawid Wysakowicz <dwysakow...@apache.org> 于2022年10月20日周四 16:59写道: > >>>>>> That's the final status we will arrive at. > >>>>>> > >>>>>> IIUC, we cannot just remove the original method but just mark it as > >>>>> deprecated so two methods have to exist together in the short term. > >>>>>> Users may also need to migrate their own external serializers which > >> is > >>>>> a long run. > >>>>>> I'd like to make jobs where users could always work without > modifying > >>>>> any codes before removing the deprecated method so I provide a > default > >>>>> implementation in the proposal. > >>>>>> > >>>>>> I understand it works for old serializers, but it makes writing new > >>>>> serializers more complicated. It's not as simple as extending an > >> interface > >>>>> and implementing all the methods it tells you to. You have to dig > into > >>>>> documentation and check that you must implement one of the methods > >> (either > >>>>> the old or the new one). Otherwise you end up in an ifinite loop. > >>>>>> > >>>>>> If we break the compatibility, yes we cause some annoyance, but I > >> feel > >>>>> it's a safer option. BTW, there is a proposal around serializer that > >> also > >>>>> causes some incompatibilities so we could squash this together: > >>>>> https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w < > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > > >>>>>> > >>>>>> I don't want to force that, but would be really happy to hear what > >>>>> others think. > >>>>>> > >>>>>> Best, > >>>>>> > >>>>>> Dawid > >>>>>> > >>>>>> On 19/10/2022 04:05, Hangxiang Yu wrote: > >>>>>> > >>>>>> Hi, Dawid. > >>>>>> > >>>>>> > >>>>>> Thanks for your reply. > >>>>>> > >>>>>> Should we introduce the new method to the TypeSerializerSnapshot > >>>>> instead? > >>>>>> You provided a valuable option, let me try to list the benefit and > >>>>> cost. > >>>>>> benefit: > >>>>>> > >>>>>> TypeSerializerSnapshot still owns the responsibility of resolving > >>>>> schema compatibility so TypeSerializer could just pay attention to > its > >>>>> serialization and deserialization as before. > >>>>>> It's very convenient to implement it based on current implementation > >>>>> by all information in TypeSerializerSnapshot and tools which is also > >>>>> helpful for users to migrate their external serializer. > >>>>>> cost: > >>>>>> > >>>>>> The interface is a bit strange because we have to construct and use > >>>>> the snapshot of the new serializer to check the compatibility. > >>>>>> I appreciate this because users could migrate more conveniently. > >>>>>> > >>>>>> For its cost, We could see TypeSerializerSnapshot as a point-in-time > >>>>> view of TypeSerializer used for forward compatibility checks and > >> persisting > >>>>> serializer within checkpoints. It seems reasonable to generate a > >>>>> point-in-time view of the serializer when we need to resolve > >> compatibility. > >>>>>> > >>>>>> I'd actually be fine with breaking some external serializers and not > >>>>> provide a default implementation of the new method. > >>>>>> That's the final status we will arrive at. > >>>>>> > >>>>>> IIUC, we cannot just remove the original method but just mark it as > >>>>> deprecated so two methods have to exist together in the short term. > >>>>>> Users may also need to migrate their own external serializers which > >> is > >>>>> a long run. > >>>>>> I'd like to make jobs where users could always work without > modifying > >>>>> any codes before removing the deprecated method so I provide a > default > >>>>> implementation in the proposal. > >>>>>> > >>>>>> On Mon, Oct 17, 2022 at 10:10 PM Dawid Wysakowicz < > >>>>> dwysakow...@apache.org> wrote: > >>>>>>> Hi Han, > >>>>>>> > >>>>>>> I think in principle your proposal makes sense and the > compatibility > >>>>>>> check indeed should be done in the opposite direction. > >>>>>>> > >>>>>>> However, I have two suggestions: > >>>>>>> > >>>>>>> 1. Should we introduce the new method to the TypeSerializerSnapshot > >>>>>>> instead? E.g. > >>>>>>> > >>>>>>> TypeSerializerSnapshot { > >>>>>>> > >>>>>>> TypeSerializerSchemaCompatibility<T> > >>>>> resolveSchemaCompatibility( > >>>>>>> TypeSerializerSnapshot<T> oldSnapshot); > >>>>>>> > >>>>>>> } > >>>>>>> > >>>>>>> I know it has the downside that we'd need to create a snapshot for > >> the > >>>>>>> new serializer during a restore, but the there is a lot of tooling > >> and > >>>>>>> subclasses in the *Snapshot stack that it won't be possible to > >> migrate > >>>>>>> to TypeSerializer stack. I have classes such as > >>>>>>> CompositeTypeSerializerSnapshot and alike in mind. > >>>>>>> > >>>>>>> 2. I'd actually be fine with breaking some external serializers and > >>>>> not > >>>>>>> provide a default implementation of the new method. If we don't do > >>>>> that > >>>>>>> we will have two methods with default implementation which call > each > >>>>>>> other. It makes it also a bit harder to figure out which methods > are > >>>>>>> mandatory to be implemented going forward. > >>>>>>> > >>>>>>> What do you think? > >>>>>>> > >>>>>>> Best, > >>>>>>> > >>>>>>> Dawid > >>>>>>> > >>>>>>> On 17/10/2022 04:43, Hangxiang Yu wrote: > >>>>>>>> Hi, Han. > >>>>>>>> > >>>>>>>> Both the old method and the new method can get previous and new > >>>>> inner > >>>>>>>> information. > >>>>>>>> > >>>>>>>> The new serializer will decide it just like the old serializer did > >>>>> before. > >>>>>>>> The method just specify the schema compatibility result so that > >>>>> other > >>>>>>>> behaviours is same as before. > >>>>>>>> > >>>>>>>> On Fri, Oct 14, 2022 at 11:40 AM Han Yin <alexyin...@gmail.com> > >>>>> wrote: > >>>>>>>>> Hi Hangxiang, > >>>>>>>>> > >>>>>>>>> Thanks for the proposal. It seems more reasonable to let the new > >>>>>>>>> serializer claim the compatibility in the cases you mentioned. > >>>>>>>>> > >>>>>>>>> I have but one question here. What happens in the case of > >>>>>>>>> “compatibleAfterMigration” after we completely reverse the > >>>>> direction (in > >>>>>>>>> step 3)? To be specific, migration from an old schema calls for > >>>>> the > >>>>>>>>> previous serializer to read bytes into state objects. How should > >> a > >>>>> new > >>>>>>>>> serializer decide whether the migration is possible? > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Han > >>>>>>>>> > >>>>>>>>> On 2022/10/12 12:41:07 Hangxiang Yu wrote: > >>>>>>>>>> Dear Flink developers, > >>>>>>>>>> > >>>>>>>>>> I would like to start a discussion thread on FLIP-263[1] > >>>>> proposing to > >>>>>>>>>> improve the usability of resolving schema compatibility. > >>>>>>>>>> > >>>>>>>>>> Currently, the place for compatibility checks is > >>>>>>>>>> TypeSerializerSnapshot#resolveSchemaCompatibility > >>>>>>>>>> which belongs to the old serializer, There are no ways for users > >>>>> to > >>>>>>>>> specify the > >>>>>>>>>> compatibility with the old serializer in the new customized > >>>>> serializer. > >>>>>>>>>> The FLIP hopes to reverse the direction of resolving schema > >>>>> compatibility > >>>>>>>>>> to improve the usability of resolving schema compatibility. > >>>>>>>>>> > >>>>>>>>>> [1] > >>>>>>>>>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility > < > https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility > > > >>>>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Best, > >>>>>> Hangxiang. > >>> -- > >>> Best, > >>> Hangxiang. > >>> > >> > >> -- > >> Best, > >> Hangxiang. > >> > -- Best, Hangxiang.