[jira] [Resolved] (FLINK-36750) Paimon connector would reuse sequence number when schema evolution happened

Ruan Hang (Jira) Wed, 20 Nov 2024 03:05:45 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-36750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ruan Hang resolved FLINK-36750.
-------------------------------
    Resolution: Fixed

> Paimon connector would reuse sequence number when schema evolution happened
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-36750
>                 URL: https://issues.apache.org/jira/browse/FLINK-36750
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.0
>            Reporter: Yanquan Lv
>            Assignee: Yanquan Lv
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: cdc-3.3.0, cdc-3.2.1
>
>         Attachments: image-2024-11-20-13-00-58-282.png, 
> image-2024-11-20-13-02-47-612.png, image-2024-11-20-13-04-53-635.png
>
>
> When schema evolution happens, we prepare a commit and recreate the 
> FileStoreWrite to obtain the latest schema. However, FileStoreWrite maintains 
> some information in memory, such as the sequence number, so we can't simply 
> discard it and create a new one. Instead, we should extract that information 
> from the old write and rebuild the new one with it.
> The sequence number is used to determine the order of records that share the 
> same primary key. If we don't strictly maintain this order, it may lead to 
> unexpected results.
> The following pictures show the problem we are currently facing:
> 1) Schema evolution happened between the second and third files 
> (`schema_id` changed).
> !image-2024-11-20-13-04-53-635.png!
> 2) The sequence numbers here are expected to keep increasing. However, there 
> is an overlap in `min_sequence_number` between the second file and the 
> third file.
> !image-2024-11-20-13-02-47-612.png!
> Because the sequence numbers get mixed up, a read may return update-before 
> data.
>  
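
The effect described above can be reproduced with a small toy model. This is 
only a sketch under stated assumptions: `ToyWriter`, `Row`, `latestValue` and 
`run` are hypothetical names invented for illustration and are not the Paimon 
FileStoreWrite API. It assumes that rows with the same primary key are resolved 
by keeping the one with the highest sequence number, and that the writer keeps 
its counter only in memory.

```java
import java.util.ArrayList;
import java.util.List;

public class SequenceNumberSketch {

    /** A written row: primary key, value, and the sequence number assigned by the writer. */
    static final class Row {
        final String key;
        final String value;
        final long sequenceNumber;

        Row(String key, String value, long sequenceNumber) {
            this.key = key;
            this.value = value;
            this.sequenceNumber = sequenceNumber;
        }
    }

    /** Hypothetical stand-in for a writer that keeps its sequence counter only in memory. */
    static final class ToyWriter {
        private long nextSequenceNumber;

        ToyWriter(long startSequenceNumber) {
            this.nextSequenceNumber = startSequenceNumber;
        }

        Row write(String key, String value) {
            return new Row(key, value, nextSequenceNumber++);
        }

        long nextSequenceNumber() {
            return nextSequenceNumber;
        }
    }

    /** Resolves one primary key by keeping the row with the highest sequence number. */
    public static String latestValue(List<Row> rows, String key) {
        Row latest = null;
        for (Row row : rows) {
            if (row.key.equals(key)
                    && (latest == null || row.sequenceNumber > latest.sequenceNumber)) {
                latest = row;
            }
        }
        return latest == null ? null : latest.value;
    }

    /** Writes two rows, replaces the writer (as on schema evolution), then writes a third row. */
    public static String run(boolean carryOverSequenceNumber) {
        List<Row> rows = new ArrayList<>();
        ToyWriter writer = new ToyWriter(0);
        rows.add(writer.write("pk-1", "v1"));
        rows.add(writer.write("pk-1", "v2"));

        // Either restart the counter at 0 or continue from the old writer's state.
        long start = carryOverSequenceNumber ? writer.nextSequenceNumber() : 0;
        writer = new ToyWriter(start);
        rows.add(writer.write("pk-1", "v3"));

        return latestValue(rows, "pk-1");
    }

    public static void main(String[] args) {
        System.out.println("counter reset:        " + run(false));
        System.out.println("counter carried over: " + run(true));
    }
}
```

In this model, restarting the counter makes the newest row (`v3`) reuse 
sequence number 0, so the older `v2` wins the merge. Carrying the counter over 
to the new writer keeps the numbers increasing, and `v3` is returned.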



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
