[ 
https://issues.apache.org/jira/browse/FLINK-36681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Runkang He updated FLINK-36681:
-------------------------------
    Description: 
The chunk-splitting query shown in the incremental snapshot reading section of 
the MySQL CDC doc is wrong; see the doc 
[link|https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/flink-sources/mysql-cdc/#incremental-snapshot-reading].

Currently, for other primary key column types, the doc says that MySQL CDC 
Source executes a statement of the form SELECT MAX(STR_ID) AS chunk_high FROM 
(SELECT * FROM TestTable WHERE STR_ID > 'uuid-001' limit 25) to get the low and 
high value for each chunk.

But this query does not order the rows by STR_ID, so the limit 25 is not 
guaranteed to return the 25 smallest matching values, leading to wrong results.
The correct one is SELECT MAX(STR_ID) AS chunk_high FROM (SELECT * FROM 
TestTable WHERE STR_ID > 'uuid-001' *ORDER BY STR_ID ASC* limit 25)
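
To illustrate how the corrected query yields chunk boundaries, here is a minimal sketch. It is not the connector's implementation: it uses Python's built-in sqlite3 as a stand-in for MySQL, the helper name chunk_high_values is hypothetical, and the derived-table alias t is an addition (MySQL requires an alias on a derived table). The table contents are made up for the example.

```python
import sqlite3


def chunk_high_values(conn, chunk_size, start):
    """Collect each chunk's upper bound by repeating the corrected query.

    Hypothetical helper for illustration only. Each round asks for the
    largest STR_ID among the next `chunk_size` rows after `low`, in key order.
    """
    highs = []
    low = start
    while True:
        row = conn.execute(
            "SELECT MAX(STR_ID) AS chunk_high FROM ("
            " SELECT STR_ID FROM TestTable WHERE STR_ID > ?"
            " ORDER BY STR_ID ASC LIMIT ?"
            ") AS t",
            (low, chunk_size),
        ).fetchone()
        if row[0] is None:  # no rows left past `low`
            break
        highs.append(row[0])
        low = row[0]  # the next chunk starts after this chunk's high value
    return highs


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestTable (STR_ID TEXT PRIMARY KEY)")

# Made-up data: uuid-001 .. uuid-100, inserted in reverse so that the
# insertion order differs from the key order.
ids = [f"uuid-{i:03d}" for i in range(1, 101)]
conn.executemany("INSERT INTO TestTable VALUES (?)", [(i,) for i in reversed(ids)])

highs = chunk_high_values(conn, 25, "uuid-001")
print(highs)
```

With the ORDER BY in place, each inner query selects the next 25 keys in ascending order, so consecutive chunks are contiguous and do not overlap. Without it, SQL gives no guarantee about which 25 matching rows the LIMIT keeps, so the computed chunk_high could skip past rows that belong in the chunk.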

  was:
The chunks splitting query in incremental snapshot reading section is wrong, 
see more at doc link.
Currently for other primary key column type, MySQL CDC Source executes the 
statement in the form of SELECT MAX(STR_ID) AS chunk_high FROM (SELECT * FROM 
TestTable WHERE STR_ID > 'uuid-001' limit 25) to get the low and high value for 
each chunk.
But this query misses the ordering of STR_ID , leading to wrong results.
The correct one is:SELECT MAX(STR_ID) AS chunk_high FROM (SELECT * FROM 
TestTable WHERE STR_ID > 'uuid-001' ORDER BY STR_ID ASC limit 25)


> Wrong chunks splitting query in incremental snapshot reading section in mysql 
> cdc doc
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-36681
>                 URL: https://issues.apache.org/jira/browse/FLINK-36681
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: cdc-3.3.0, cdc-3.2.1
>            Reporter: Runkang He
>            Priority: Minor
>
> The chunk-splitting query shown in the incremental snapshot reading section 
> of the MySQL CDC doc is wrong; see the doc 
> [link|https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/flink-sources/mysql-cdc/#incremental-snapshot-reading].
> Currently, for other primary key column types, the doc says that MySQL CDC 
> Source executes a statement of the form SELECT MAX(STR_ID) AS chunk_high FROM 
> (SELECT * FROM TestTable WHERE STR_ID > 'uuid-001' limit 25) to get the low 
> and high value for each chunk.
> But this query does not order the rows by STR_ID, so the limit 25 is not 
> guaranteed to return the 25 smallest matching values, leading to wrong results.
> The correct one is SELECT MAX(STR_ID) AS chunk_high FROM (SELECT * FROM 
> TestTable WHERE STR_ID > 'uuid-001' *ORDER BY STR_ID ASC* limit 25)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
