+1 for Hang's point: introduce an option that defaults to false, to avoid a possible performance regression and a potential compatibility break.
Best,
Leonard

> On Nov 20, 2025, at 15:08, Hang Ruan <[email protected]> wrote:
>
> Hi, Tejansh
>
> Thanks for your work.
>
> I have some thoughts regarding this proposed change, and there are a
> few points that need to be emphasized:
> First, since Flink SQL/Table API relies on RowData, it cannot handle
> transactional metadata. Therefore, this feature can only be supported
> in the DataStream API.
> Second, because parsing this transactional information may introduce
> performance overhead, it would be best to add an option to control
> whether this parsing is enabled, and it should be disabled by default.
> As I understand it, the current proposal only includes changes to the
> CDC connector and does not cover the MySQL pipeline connector. If we
> decide to support this feature in the MySQL pipeline connector in the
> future, we can discuss that separately in another JIRA ticket.
>
> Best,
> Hang
>
> On Tue, Nov 18, 2025 at 5:13 PM Tejansh Rana
> <[email protected]> wrote:
>>
>> Hello,
>>
>> Bumping this thread.
>> I have created a Jira ticket describing the proposal. Based on
>> Gunnar's feedback, I have also included the base Source Emitter, which
>> would cover this feature for connectors like Postgres:
>> https://issues.apache.org/jira/browse/FLINK-38691
>> This is the draft PR with the proposed changes for the MySQL connector:
>> https://github.com/apache/flink-cdc/pull/4170
>>
>> I would appreciate some feedback on this proposal and I would be happy
>> to contribute the feature.
>>
>> Thank you,
>> Tejansh
>>
>> From: Tejansh Rana <[email protected]>
>> Date: Tuesday, 4 November 2025 at 16:50
>> To: [email protected] <[email protected]>
>> Subject: Re: [PROPOSAL] Support for MySQL Transaction Boundary Events in
>> Flink CDC Connector
>>
>> Thank you, Gunnar!
>> I have created a draft PR for the proposed feature:
>> https://github.com/apache/flink-cdc/pull/4170
>> Looking forward to hearing more feedback.
>>
>> Thank you,
>> Tejansh
>>
>> From: Gunnar Morling <[email protected]>
>> Date: Monday, 3 November 2025 at 17:19
>> To: [email protected] <[email protected]>
>> Subject: Re: [PROPOSAL] Support for MySQL Transaction Boundary Events in
>> Flink CDC Connector
>>
>> Hey all,
>>
>> I'd love to see support for this! Coincidentally, I am just working on a
>> PoC right now which uses the custom watermarks in the DataStream v2 API
>> to represent transaction boundaries. It seems this is a great fit
>> conceptually. In any case, it would be nice to not only support this for
>> MySQL but also other DBs. Debezium provides that transaction metadata
>> for a range of connectors, including Postgres.
>>
>> --Gunnar
>>
>>
>> On Mon, 3 Nov 2025 at 15:53, Tejansh Rana <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> Following up on the below proposal. I would appreciate your thoughts
>>> and would like to know if we could move forward with this feature.
>>>
>>> Thank you,
>>> Tejansh
>>>
>>> From: Tejansh Rana <[email protected]>
>>> Date: Friday, 17 October 2025 at 15:58
>>> To: [email protected] <[email protected]>
>>> Subject: [PROPOSAL] Support for MySQL Transaction Boundary Events in
>>> Flink CDC Connector
>>>
>>> Hi team,
>>>
>>> Following my discussion with Leonard Xu at Flink Forward, I am writing
>>> to propose a feature enhancement for the Flink MySQL CDC connector
>>> related to how it handles transaction metadata from the MySQL binary
>>> log.
>>>
>>> Problem Statement:
>>> In data streaming pipelines that require transactional guarantees or
>>> need to group atomic changes together, it is essential to identify the
>>> boundaries of the original database transaction (i.e., the BEGIN and
>>> COMMIT or END events). Currently, the Flink MySQL CDC connector appears
>>> to skip these transaction lifecycle events:
>>> https://github.com/apache/flink-cdc/blob/23a1c2efb6fa9ce1c9f17b3836f6aaa995bb0660/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/org/apache/flink/cdc/connectors/mysql/source/reader/MySqlRecordEmitter.java#L77
>>> I have also attached a screenshot of the logs from this behaviour.
>>>
>>> This omission makes it challenging to reconstruct the original
>>> transaction scope. Without explicit transaction markers, downstream
>>> Flink jobs cannot easily guarantee atomicity across sinks.
>>>
>>> Proposed Solution:
>>> The underlying CDC mechanism, Debezium, supports emitting transaction
>>> boundary events (BEGIN and END/COMMIT) through its configuration.
>>>
>>> We propose enhancing the Flink MySQL CDC connector to expose this
>>> transaction metadata to the Flink pipeline. The connector should emit
>>> specialised records or metadata fields that indicate the start and end
>>> of a transaction as emitted by Debezium. We would be happy to create a
>>> PR with this feature if this proposal goes ahead.
>>>
>>> Thank you,
>>> Tejansh
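
For reference, the opt-in behaviour discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the draft PR: the class name, the method name, and the `includeTransactionMetadata` flag are hypothetical. The only real name used is the Debezium connector property `provide.transaction.metadata`, which Debezium documents as disabled by default.

```java
import java.util.Properties;

// Hypothetical helper, not part of Flink CDC: shows an opt-in flag that is
// off by default and only then enables Debezium's transaction metadata.
class TransactionMetadataOptions {

    // Real Debezium connector property for transaction boundary events.
    static final String DEBEZIUM_TX_METADATA_KEY = "provide.transaction.metadata";

    // includeTransactionMetadata is an assumed connector-level option;
    // when false, no property is set and Debezium keeps its own default.
    static Properties buildDebeziumProperties(boolean includeTransactionMetadata) {
        Properties props = new Properties();
        if (includeTransactionMetadata) {
            props.setProperty(DEBEZIUM_TX_METADATA_KEY, "true");
        }
        return props;
    }

    public static void main(String[] args) {
        Properties disabled = buildDebeziumProperties(false);
        Properties enabled = buildDebeziumProperties(true);
        // Print the effective value, falling back to "false" when unset.
        System.out.println("disabled: " + disabled.getProperty(DEBEZIUM_TX_METADATA_KEY, "false"));
        System.out.println("enabled: " + enabled.getProperty(DEBEZIUM_TX_METADATA_KEY, "false"));
    }
}
```

In a DataStream job, properties like these would typically be handed to the source builder's Debezium properties. As the proposal notes, setting the property alone would not be enough today, because the record emitter appears to skip the transaction lifecycle events.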
