Hey all, I'd love to see support for this! Coincidentally, I am just working on a PoC right now which uses the custom watermarks in the DataStream v2 API to represent transaction boundaries. It seems this is a great fit conceptually. In any case, it would be nice to not only support this for MySQL but also other DBs. Debezium provides that transaction metadata for a range of connectors, including Postgres.
--Gunnar On Mon, 3 Nov 2025 at 15:53, Tejansh Rana <[email protected]> wrote: > Hello, > > Following up on the below proposal. Would appreciate your thoughts and if > we could move forward with this feature. > > Thank you, > Tejansh > > From: Tejansh Rana <[email protected]> > Date: Friday, 17 October 2025 at 15:58 > To: [email protected] <[email protected]> > Subject: [PROPOSAL] Support for MySQL Transaction Boundary Events in Flink > CDC Connector > You don't often get email from [email protected]. Learn > why this is important<https://aka.ms/LearnAboutSenderIdentification> > > EXTERNAL EMAIL : Do not click any links or open any attachments unless you > trust the sender and know the content is safe. > > Hi team, > > Following my discussion with Leonard Xu at Flink Forward, I am writing to > propose a feature enhancement for the Flink MySQL CDC connector related to > how it handles transaction metadata from the MySQL binary log. > > Problem Statement: > In data streaming pipelines that require transactional guarantees or need > to group atomic changes together, it is essential to identify the > boundaries of the original database transaction (i.e., the BEGIN and COMMIT > or END events). Currently, the Flink MySQL CDC connector appears to skip > these transaction lifecycle events - > https://github.com/apache/flink-cdc/blob/23a1c2efb6fa9ce1c9f17b3836f6aaa995bb0660/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/org/apache/flink/cdc/connectors/mysql/source/reader/MySqlRecordEmitter.java#L77 > . > I have also attached a screenshot of the logs from this behaviour. > > This omission makes it challenging to reconstruct the original transaction > scope. Without explicit transaction markers, downstream Flink jobs cannot > easily guarantee atomicity across sinks. > > Proposed Solution: > The underlying CDC mechanism, Debezium, supports emitting transaction > boundary events (BEGIN and END/COMMIT) through its configuration. > > We propose enhancing the Flink MySQL CDC connector to expose this > transaction metadata to the Flink pipeline. The connector should emit > specialised records or metadata fields that indicate the start and end of a > transaction as emitted. We would be happy to create a PR with this feature > if this proposal goes ahead. > > Thank you, > Tejansh > > >
