Hi, Weiqing

Thanks for the FLIP! In general I'd +1 for schema evolution for RowData
types, which will enhance the user experience of SQL jobs.

I have one question for now:

You suggested introducing new methods in `TypeSerializerSnapshot`, but
would it be possible to leverage the existing state migration procedure [1],
which also deserializes with the old serializer and re-serializes with the
new one? IIUC, all we need is to properly implement
`resolveSchemaCompatibility` for `RowDataSerializerSnapshot` [2], since it
will be invoked here [3]. No need for new methods, right?
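
To make the question concrete, here is a rough, self-contained sketch of
the kind of decision `resolveSchemaCompatibility` would have to make. It is
not Flink code: `Compat` is only a stand-in for
`TypeSerializerSchemaCompatibility`, and `Field`, `resolve` and the class
name are hypothetical. The rules are assumptions based on the cases listed
in the FLIP motivation (reordering and added nullable fields need
migration); treating removed fields and type changes as incompatible is my
own simplification.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a row-schema compatibility check. Not Flink code. */
final class RowSchemaCompatSketch {

    /** Stand-in for the outcomes of Flink's TypeSerializerSchemaCompatibility. */
    enum Compat { COMPATIBLE_AS_IS, COMPATIBLE_AFTER_MIGRATION, INCOMPATIBLE }

    /** Hypothetical field descriptor: name, type name, nullability. */
    static final class Field {
        final String name;
        final String type;
        final boolean nullable;

        Field(String name, String type, boolean nullable) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }

        boolean sameAs(Field other) {
            return name.equals(other.name)
                    && type.equals(other.type)
                    && nullable == other.nullable;
        }
    }

    /** Compares the schema stored in the snapshot with the schema of the new serializer. */
    static Compat resolve(List<Field> oldFields, List<Field> newFields) {
        // Identical fields in identical order: the bytes can be read unchanged.
        if (oldFields.size() == newFields.size()) {
            boolean identical = true;
            for (int i = 0; i < oldFields.size(); i++) {
                identical &= oldFields.get(i).sameAs(newFields.get(i));
            }
            if (identical) {
                return Compat.COMPATIBLE_AS_IS;
            }
        }

        Map<String, Field> oldByName = new HashMap<>();
        for (Field f : oldFields) {
            oldByName.put(f.name, f);
        }

        int matched = 0;
        for (Field f : newFields) {
            Field old = oldByName.get(f.name);
            if (old == null) {
                // Added field: old rows can only be padded with null.
                if (!f.nullable) {
                    return Compat.INCOMPATIBLE;
                }
            } else if (!old.type.equals(f.type)) {
                // Assumption: no type widening in this sketch.
                return Compat.INCOMPATIBLE;
            } else {
                matched++;
            }
        }

        // Assumption: every old field must survive (no dropped fields).
        if (matched != oldFields.size()) {
            return Compat.INCOMPATIBLE;
        }

        // Reordered and/or extended with nullable fields: rewrite state on restore.
        return Compat.COMPATIBLE_AFTER_MIGRATION;
    }
}
```

If the real snapshot returned the "compatible after migration" result for
such cases, my understanding is that the existing path in [1] would read
each value with the old serializer and write it back with the new one.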


[1]
https://github.com/apache/flink/blob/f8b3c4b9a8ce1c6a094fcc0f292faea4bad8806c/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L201-L205
[2]
https://github.com/apache/flink/blob/f8b3c4b9a8ce1c6a094fcc0f292faea4bad8806c/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/typeutils/RowDataSerializer.java#L335
[3]
https://github.com/apache/flink/blob/f8b3c4b9a8ce1c6a094fcc0f292faea4bad8806c/flink-runtime/src/main/java/org/apache/flink/runtime/state/StateSerializerProvider.java#L312


Best,
Zakelly

On Sat, Apr 26, 2025 at 1:47 PM Weiqing Yang <yangweiqing...@gmail.com>
wrote:

> Hi all,
>
> I’d like to initiate a discussion about enhancing state schema evolution
> support for RowData in Flink.
>
> *Motivation*
>
> Flink applications frequently need to evolve their state schema as business
> requirements change. Currently, when users update a Table API or SQL job
> with schema changes involving RowData types (particularly nested
> structures), they encounter serialization compatibility errors during state
> restoration, causing job failures. The issue occurs because existing state
> migration mechanisms don't properly handle RowData types during schema
> evolution, preventing users from making backward-compatible changes like:
>
>    - Adding nullable fields to existing structures
>    - Reordering fields within a row while preserving field names
>    - Evolving nested row structures
>
> This limitation impacts production applications using Flink's Table API, as
> the RowData type is central to this interface. Users are forced to choose
> between maintaining outdated schemas or reprocessing all state data when
> schema changes are required.
>
> Here’s the proposal document:
> https://docs.google.com/document/d/1WtAxp-jAVTLMOfWNldLCAoK137P0ZCMxR8hOZGcMxuc/edit?tab=t.0
> Your feedback and ideas are welcome to refine this feature.
>
> Thanks,
> Weiqing
>
