mielientiev commented on PR #3845:
URL: https://github.com/apache/flink-cdc/pull/3845#issuecomment-3165205505
@lvyanquan The analysis is partially correct. MySQL servers can have gaps in
their executed transactions (e.g., `A:1-102:105-150`), and these gaps may or
may not get filled later. This isn't a problem.
However, there's something the analysis missed. When MySQL processes
transactions in parallel, they can appear out of order in the binlog. For
example, even though the server executed transactions `A:1-10`, the binlog
might show them like this when sending to Flink CDC:
```
-Tx, A:1
-Tx, A:2
-Tx, A:4
-Tx, A:6
-Tx, A:5
-Tx, A:3
-Tx, A:7
-Tx, A:8
-Tx, A:10
-Tx, A:9
```
Now imagine our application stopped and saved its state after processing
transaction `A:6`. The restored GTID set would be `A:1-2:4:6` (we've seen 1, 2,
4, and 6, but not 3 or 5 yet).
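To make the restored value concrete, here is a minimal Python sketch that replays the binlog order from the example above and renders the processed transactions as a GTID-style interval string. The helper name `format_gtid_set` and the short source id `A` are illustrative assumptions, not Flink CDC or MySQL APIs.

```python
def format_gtid_set(source_id, seen):
    """Render transaction numbers as 'source:start-end:...' intervals.

    Hypothetical helper for illustration only; it keeps every gap intact.
    """
    intervals = []
    for n in sorted(set(seen)):
        if intervals and n == intervals[-1][1] + 1:
            # n directly extends the last interval
            intervals[-1][1] = n
        else:
            # a gap was found, so start a new interval
            intervals.append([n, n])
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in intervals]
    return source_id + ":" + ":".join(parts)


# Order in which the binlog delivers the transactions in the example
binlog_order = [1, 2, 4, 6, 5, 3, 7, 8, 10, 9]

# The application stops right after processing transaction 6
processed = binlog_order[: binlog_order.index(6) + 1]

print(format_gtid_set("A", processed))     # A:1-2:4:6
print(format_gtid_set("A", binlog_order))  # A:1-10
```

Once every transaction has been seen, the gaps close on their own and the set collapses to `A:1-10`; until then the gaps carry real information.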
When we restart, we need to tell MySQL which transactions we've already
processed so it knows what to send next. The problem is that the current code
merges these gaps and produces `A:1-6` as the restored value. This tells MySQL
we've processed ALL transactions from 1 to 6, including 3 and 5. MySQL then
skips transactions 3 and 5, causing data loss.
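The effect of the merge can be sketched with plain Python sets. This is a simplified model, assuming the server sends every executed transaction that is missing from the GTID set reported by the client; `merge_gaps` and `transactions_to_send` are hypothetical names and do not reproduce the actual Flink CDC code.

```python
def merge_gaps(seen):
    """Hypothetical stand-in for the faulty merge: fill every gap."""
    return set(range(min(seen), max(seen) + 1))


def transactions_to_send(executed, reported):
    """Simplified model: the server sends whatever the client has not reported."""
    return sorted(executed - reported)


executed_on_server = set(range(1, 11))   # A:1-10
restored = {1, 2, 4, 6}                  # A:1-2:4:6, the real checkpoint state
merged = merge_gaps(restored)            # A:1-6, what the merge reports

sent_with_gaps = transactions_to_send(executed_on_server, restored)
sent_with_merge = transactions_to_send(executed_on_server, merged)
lost = sorted(set(sent_with_gaps) - set(sent_with_merge))

print(sent_with_gaps)   # [3, 5, 7, 8, 9, 10]
print(sent_with_merge)  # [7, 8, 9, 10]
print(lost)             # [3, 5]
```

With the gaps preserved, transactions 3 and 5 are replayed after the restart; with the merged set they are never sent.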
I hope that makes it clear.
@lzshlzsh also wrote a great description of this issue here:
https://issues.apache.org/jira/browse/FLINK-38183