Hi Leonard,
Thank you for your reply. I have read the PR, and it seems to solve the problem that BEGIN/END transaction boundaries cannot be captured when collecting MySQL data. I have run into that issue before, so your reply is very helpful. However, I am currently using a different database, and it does give me BEGIN/END transaction boundaries. What I would like to know is how to use BEGIN/END properly so that I can process all of the data belonging to one transaction as a complete unit, especially since that data comes from multiple tables. I have appended a rough sketch of what I currently have in mind at the end of this mail.

Thanks,
Viktor

At 2026-02-05 11:15:38, "Leonard Xu" <[email protected]> wrote:

Hey Viktor

Maybe this PR[1] would help your case.

Best,
Leonard

[1] https://github.com/apache/flink-cdc/pull/4170

On Feb 5, 2026, at 10:42, viktor <[email protected]> wrote:

Hi there,

I have been learning Flink-related technologies recently and am currently using the combination of Flink CDC + Kafka + Flink. The pipeline works as follows: Flink CDC collects change data from multiple database tables and forwards it to Kafka, and Flink consumes that data for downstream computation.

I have now run into a problem: the collected multi-table data is strongly related, for example main tables and sub-tables whose changes usually occur within a single transaction. My initial idea was to use Kafka partitions and route data by the transaction id of the CDC messages, so that one consumer thread processes all the data of one transaction. That covers the ordinary cases. However, there is another case: data in a single table may change several times within a short period, with each change belonging to a different transaction. Those changes are then consumed by different consumers, which can lead to ordering problems. The only solution I can think of so far is to push all data through a single partition, but that is far too inefficient.

So I would like to ask: is there a flaw in my thinking? Are there better solutions, and at which stage of the pipeline should they be implemented?

Thanks,
Viktor
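P.S. To make the question at the top of this mail more concrete, here is a rough sketch (in Java, against the Flink DataStream API) of how I imagine assembling a complete transaction from the BEGIN/END boundary events. The ChangeEvent shape and its field names are only my assumption of what the parsed CDC records would carry, not an actual Flink CDC type:

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

import java.util.ArrayList;
import java.util.List;

/** Assumed shape of one parsed CDC record; not a Flink CDC class. */
class ChangeEvent {
    enum Kind { BEGIN, DATA, END }
    public String txId;    // transaction id carried by the record
    public Kind kind;      // BEGIN / END marker or an ordinary row change
    public String table;   // source table name
    public String payload; // serialized row change
}

/** Buffers the events of one transaction and emits them as a whole on END. */
class TxnAssembler extends KeyedProcessFunction<String, ChangeEvent, List<ChangeEvent>> {

    private transient ListState<ChangeEvent> buffer;

    @Override
    public void open(Configuration parameters) {
        buffer = getRuntimeContext().getListState(
                new ListStateDescriptor<>("txn-buffer", ChangeEvent.class));
    }

    @Override
    public void processElement(ChangeEvent event, Context ctx,
                               Collector<List<ChangeEvent>> out) throws Exception {
        switch (event.kind) {
            case BEGIN:            // a fresh transaction starts
                buffer.clear();
                break;
            case DATA:             // rows from any table, kept in arrival order
                buffer.add(event);
                break;
            case END: {            // transaction complete: emit as one unit, reset
                List<ChangeEvent> txn = new ArrayList<>();
                for (ChangeEvent e : buffer.get()) {
                    txn.add(e);
                }
                out.collect(txn);
                buffer.clear();
                break;
            }
        }
    }
}

I would then key the stream by transaction id, e.g. events.keyBy(e -> e.txId).process(new TxnAssembler()), so that all events of one transaction, regardless of which table they come from, land in the same buffer and are emitted together when the END marker arrives. Does that look reasonable?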

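P.P.S. For reference, this is a minimal sketch of the partition-routing idea from my original mail quoted above: using the transaction id as the Kafka record key, so that the default partitioner hashes every event of one transaction to the same partition and a single consumer thread reads them in order. The broker address, topic name, and payload are illustrative only:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class TxnKeyedForward {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String txId = "tx-42";      // extracted from the CDC message
            String cdcJson = "{ ... }"; // the raw change record
            // Keying by txId makes Kafka's default partitioner send all
            // events of this transaction to the same partition.
            producer.send(new ProducerRecord<>("cdc-changes", txId, cdcJson));
        }
    }
}

As described above, this still breaks down when the same row changes in two different transactions in quick succession, since the two transaction ids can hash to different partitions.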