Hey Hangxiang, Thanks for driving this issue. I've read through all the discussions and suggestions in this thread, and here is my take:
1. I agree that the compatibility check should be done in the opposite direction. The current interface *causes some real issues* for users using their own `TypeSerializer`, and we should help resolve this issue. 2. The new opposite interface would better still stay within the class `TypeSerializerSnapshot`. I am suggesting this mostly considering the designed contract between `TypeSerializerSnapshot` and `TypeSerializer`. I do not think it is necessary to break the current contract because of this change. 3. It seems to me that "breaking changes" are unavoidable in each of the strategies proposed. However, *I'd be very concerned and worried *to make a breaking change *in the very first step*. A more user-friendly way is to make sure of compatibility and mark the method to drop as `@Deprecated`. After several versions of adoption, carefully and publicly discussing with users to drop the method. By compatibilities, I mean the following things: 1). upgrading from a lower Flink version to the new version does not need the user to do extra work for the change. 2). In the new version, both the old and the new method should be supported, and of course, we would encourage users to use the new method. For example, dropping "TypeSerializerConfigSnapshot" started to take into action after almost 10 releases the time it was marked deprecated. Due to the above reason, I am hesitant to squash this FLIP with https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w 4. What do I suggest doing? *STEP 1: In `TypeSerializerSnapshot`, we do something like this, and keep everything else exactly the same* @Deprecated default TypeSerializerSchemaCompatibility<T> resolveSchemaCompatibility( TypeSerializer<T> newSerializer) { // use new method return newSerializer.snapshotConfiguration(). resolveSerializerSchemaCompatibility(this.restoreSerializer()); } default TypeSerializerSchemaCompatibility<T> resolveSerializerSchemaCompatibility( TypeSerializer<T> oldSerializer) { return INCOMPATIBLE; } *STEP 2: (This is a breaking change), after several releases* 1. Remove the deprecated method 2. Substitute every usage with the new method 3. Make the new method not have a default implementation STEP1 is compatible, and STEP2 is a breaking change. Best Yuan On Fri, Oct 21, 2022 at 10:01 AM godfrey he <godfre...@gmail.com> wrote: > Hi Hangxiang, Dawid, > > I also prefer to add method into TypeSerializerSnapshot, which looks > more natural. TypeSerializerSnapshot has `Version` concept, which also > can be used for compatibility. > ` > TypeSerializerSnapshot { > > TypeSerializerSchemaCompatibility<T> resolveSchemaCompatibility( > TypeSerializerSnapshot<T> oldSnapshot); > > } > ` > > Best, > Godfrey > > Dawid Wysakowicz <dwysakow...@apache.org> 于2022年10月20日周四 16:59写道: > > > > That's the final status we will arrive at. > > > > IIUC, we cannot just remove the original method but just mark it as > deprecated so two methods have to exist together in the short term. > > > > Users may also need to migrate their own external serializers which is a > long run. > > > > I'd like to make jobs where users could always work without modifying > any codes before removing the deprecated method so I provide a default > implementation in the proposal. > > > > > > I understand it works for old serializers, but it makes writing new > serializers more complicated. It's not as simple as extending an interface > and implementing all the methods it tells you to. You have to dig into > documentation and check that you must implement one of the methods (either > the old or the new one). Otherwise you end up in an ifinite loop. > > > > > > If we break the compatibility, yes we cause some annoyance, but I feel > it's a safer option. BTW, there is a proposal around serializer that also > causes some incompatibilities so we could squash this together: > https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w > > > > > > I don't want to force that, but would be really happy to hear what > others think. > > > > > > Best, > > > > Dawid > > > > On 19/10/2022 04:05, Hangxiang Yu wrote: > > > > Hi, Dawid. > > > > > > Thanks for your reply. > > > > Should we introduce the new method to the TypeSerializerSnapshot instead? > > > > You provided a valuable option, let me try to list the benefit and cost. > > > > benefit: > > > > TypeSerializerSnapshot still owns the responsibility of resolving schema > compatibility so TypeSerializer could just pay attention to its > serialization and deserialization as before. > > It's very convenient to implement it based on current implementation by > all information in TypeSerializerSnapshot and tools which is also helpful > for users to migrate their external serializer. > > > > cost: > > > > The interface is a bit strange because we have to construct and use the > snapshot of the new serializer to check the compatibility. > > > > I appreciate this because users could migrate more conveniently. > > > > For its cost, We could see TypeSerializerSnapshot as a point-in-time > view of TypeSerializer used for forward compatibility checks and persisting > serializer within checkpoints. It seems reasonable to generate a > point-in-time view of the serializer when we need to resolve compatibility. > > > > > > I'd actually be fine with breaking some external serializers and not > provide a default implementation of the new method. > > > > That's the final status we will arrive at. > > > > IIUC, we cannot just remove the original method but just mark it as > deprecated so two methods have to exist together in the short term. > > > > Users may also need to migrate their own external serializers which is a > long run. > > > > I'd like to make jobs where users could always work without modifying > any codes before removing the deprecated method so I provide a default > implementation in the proposal. > > > > > > On Mon, Oct 17, 2022 at 10:10 PM Dawid Wysakowicz < > dwysakow...@apache.org> wrote: > >> > >> Hi Han, > >> > >> I think in principle your proposal makes sense and the compatibility > >> check indeed should be done in the opposite direction. > >> > >> However, I have two suggestions: > >> > >> 1. Should we introduce the new method to the TypeSerializerSnapshot > >> instead? E.g. > >> > >> TypeSerializerSnapshot { > >> > >> TypeSerializerSchemaCompatibility<T> resolveSchemaCompatibility( > >> TypeSerializerSnapshot<T> oldSnapshot); > >> > >> } > >> > >> I know it has the downside that we'd need to create a snapshot for the > >> new serializer during a restore, but the there is a lot of tooling and > >> subclasses in the *Snapshot stack that it won't be possible to migrate > >> to TypeSerializer stack. I have classes such as > >> CompositeTypeSerializerSnapshot and alike in mind. > >> > >> 2. I'd actually be fine with breaking some external serializers and not > >> provide a default implementation of the new method. If we don't do that > >> we will have two methods with default implementation which call each > >> other. It makes it also a bit harder to figure out which methods are > >> mandatory to be implemented going forward. > >> > >> What do you think? > >> > >> Best, > >> > >> Dawid > >> > >> On 17/10/2022 04:43, Hangxiang Yu wrote: > >> > Hi, Han. > >> > > >> > Both the old method and the new method can get previous and new inner > >> > information. > >> > > >> > The new serializer will decide it just like the old serializer did > before. > >> > > >> > The method just specify the schema compatibility result so that other > >> > behaviours is same as before. > >> > > >> > On Fri, Oct 14, 2022 at 11:40 AM Han Yin <alexyin...@gmail.com> > wrote: > >> > > >> >> Hi Hangxiang, > >> >> > >> >> Thanks for the proposal. It seems more reasonable to let the new > >> >> serializer claim the compatibility in the cases you mentioned. > >> >> > >> >> I have but one question here. What happens in the case of > >> >> “compatibleAfterMigration” after we completely reverse the direction > (in > >> >> step 3)? To be specific, migration from an old schema calls for the > >> >> previous serializer to read bytes into state objects. How should a > new > >> >> serializer decide whether the migration is possible? > >> >> > >> >> Best, > >> >> Han > >> >> > >> >> On 2022/10/12 12:41:07 Hangxiang Yu wrote: > >> >>> Dear Flink developers, > >> >>> > >> >>> I would like to start a discussion thread on FLIP-263[1] proposing > to > >> >>> improve the usability of resolving schema compatibility. > >> >>> > >> >>> Currently, the place for compatibility checks is > >> >>> TypeSerializerSnapshot#resolveSchemaCompatibility > >> >>> which belongs to the old serializer, There are no ways for users to > >> >> specify the > >> >>> compatibility with the old serializer in the new customized > serializer. > >> >>> > >> >>> The FLIP hopes to reverse the direction of resolving schema > compatibility > >> >>> to improve the usability of resolving schema compatibility. > >> >>> > >> >>> [1] > >> >>> > >> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility > >> > > >> > > > > > > > > > -- > > Best, > > Hangxiang. >