That's the final status we will arrive at.

   IIUC, we cannot just remove the original method but just mark it as
   deprecated so two methods have to exist together in the short term.

   Users may also need to migrate their own external serializers which
   is a long run.

   I'd like to make jobs where users could always work without
   modifying any codes before removing the deprecated method so I
   provide a default implementation in the proposal.


I understand it works for old serializers, but it makes writing new serializers more complicated. It's not as simple as extending an interface and implementing all the methods it tells you to. You have to dig into documentation and check that you must implement one of the methods (either the old or the new one). Otherwise you end up in an ifinite loop.


If we break the compatibility, yes we cause some annoyance, but I feel it's a safer option. BTW, there is a proposal around serializer that also causes some incompatibilities so we could squash this together: https://lists.apache.org/thread/v1q28zg5jhxcqrpq67pyv291nznd3n0w


I don't want to force that, but would be really happy to hear what others think.


Best,

Dawid

On 19/10/2022 04:05, Hangxiang Yu wrote:

Hi, Dawid.


Thanks for your reply.

Should we introduce the new method to the TypeSerializerSnapshotinstead?

You provided a valuable option, let me try to list the benefit and cost.

benefit:

 1. TypeSerializerSnapshot still owns the responsibility of resolving
    schema compatibility so TypeSerializer could just pay attention to
    its serialization and deserialization as before.
 2. It's very convenient to implement it based on current
    implementation by all information in TypeSerializerSnapshot and
    tools which is also helpful for users to migrate their external
    serializer.

cost:

 1. The interface is a bit strange because we have to construct and
    use the snapshot of the new serializer to check the compatibility.

I appreciate this because users could migrate more conveniently.

For its cost, We could see TypeSerializerSnapshot as a point-in-time view of TypeSerializer used for forward compatibility checks and persisting serializer within checkpoints. It seems reasonable to generate a point-in-time view of the serializer when we need to resolve compatibility.


I'd actually be fine with breaking some external serializers and notprovide a default implementation of the new method.

That's the final status we will arrive at.

IIUC, we cannot just remove the original method but just mark it as deprecated so two methods have to exist together in the short term.

Users may also need to migrate their own external serializers which is a long run.

I'd like to make jobs where users could always work without modifying any codes before removing the deprecated method so I provide a default implementation in the proposal.


On Mon, Oct 17, 2022 at 10:10 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote:

    Hi Han,

    I think in principle your proposal makes sense and the compatibility
    check indeed should be done in the opposite direction.

    However, I have two suggestions:

    1. Should we introduce the new method to the TypeSerializerSnapshot
    instead? E.g.

        TypeSerializerSnapshot {

            TypeSerializerSchemaCompatibility<T>
    resolveSchemaCompatibility(
                 TypeSerializerSnapshot<T> oldSnapshot);

        }

    I know it has the downside that we'd need to create a snapshot for
    the
    new serializer during a restore, but the there is a lot of tooling
    and
    subclasses in the *Snapshot stack that it won't be possible to
    migrate
    to TypeSerializer stack. I have classes such as
    CompositeTypeSerializerSnapshot and alike in mind.

    2. I'd actually be fine with breaking some external serializers
    and not
    provide a default implementation of the new method. If we don't do
    that
    we will have two methods with default implementation which call each
    other. It makes it also a bit harder to figure out which methods are
    mandatory to be implemented going forward.

    What do you think?

    Best,

    Dawid

    On 17/10/2022 04:43, Hangxiang Yu wrote:
    > Hi, Han.
    >
    > Both the old method and the new method can get previous and new
    inner
    > information.
    >
    > The new serializer will decide it just like the old serializer
    did before.
    >
    > The method just specify the schema compatibility result so that
    other
    > behaviours is same as before.
    >
    > On Fri, Oct 14, 2022 at 11:40 AM Han Yin <alexyin...@gmail.com>
    wrote:
    >
    >> Hi Hangxiang,
    >>
    >> Thanks for the proposal. It seems more reasonable to let the new
    >> serializer claim the compatibility in the cases you mentioned.
    >>
    >> I have but one question here. What happens in the case of
    >> “compatibleAfterMigration” after we completely reverse the
    direction (in
    >> step 3)?  To be specific, migration from an old schema calls
    for the
    >> previous serializer to read bytes into state objects. How
    should a new
    >> serializer decide whether the migration is possible?
    >>
    >> Best,
    >> Han
    >>
    >> On 2022/10/12 12:41:07 Hangxiang Yu wrote:
    >>> Dear Flink developers,
    >>>
    >>> I would like to start a discussion thread on FLIP-263[1]
    proposing to
    >>> improve the usability of resolving schema compatibility.
    >>>
    >>> Currently, the place for compatibility checks is
    >>> TypeSerializerSnapshot#resolveSchemaCompatibility
    >>> which belongs to the old serializer, There are no ways for
    users to
    >> specify the
    >>> compatibility with the old serializer in the new customized
    serializer.
    >>>
    >>> The FLIP hopes to reverse the direction of resolving schema
    compatibility
    >>> to improve the usability of resolving schema compatibility.
    >>>
    >>> [1]
    >>>
    >>
    
https://cwiki.apache.org/confluence/display/FLINK/FLIP-263%3A+Improve+resolving+schema+compatibility
    >
    >



--
Best,
Hangxiang.

Attachment: OpenPGP_0x31D2DD10BFC15A2D.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Reply via email to