Vishal P R created FLINK-38991:
----------------------------------

             Summary: MySQL CDC fails to detect online schema changes with 
custom gh-ost or pt-osc table naming
                 Key: FLINK-38991
                 URL: https://issues.apache.org/jira/browse/FLINK-38991
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
    Affects Versions: cdc-3.5.0
            Reporter: Vishal P R


The Flink CDC MySQL connector currently uses a hardcoded regular expression, 
{{^_(.*)_(gho|new)$}}, to identify temporary tables created by the online 
schema change tools gh-ost and pt-osc.

This assumes fixed naming conventions for temporary tables. However, both tools 
support custom table naming:
 * gh-ost allows custom names via the {{-force-table-names}} option.

 * pt-osc supports custom table names through the {{--new-table-name}} option.

When these options are used, temporary table names may no longer match the 
hardcoded pattern, so the connector either fails to detect the temporary 
tables or cannot associate them with their original tables. As a result, 
online schema changes can be missed or parsed incorrectly.
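
The mismatch can be reproduced in isolation with a minimal sketch. It only 
exercises the regular expression quoted above and is not the connector's 
actual code. The class name {{HardcodedPatternDemo}}, the helper 
{{originalTable}}, and the custom name {{_orders_shadow}} are hypothetical and 
chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HardcodedPatternDemo {
    // The hardcoded expression quoted in this issue.
    static final Pattern HARDCODED = Pattern.compile("^_(.*)_(gho|new)$");

    // Returns the captured original table name, or null if the name
    // is not recognized as a temporary table.
    static String originalTable(String tempTableName) {
        Matcher m = HARDCODED.matcher(tempTableName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Default-style names are recognized.
        System.out.println(originalTable("_orders_gho"));    // orders
        System.out.println(originalTable("_orders_new"));    // orders
        // Hypothetical custom name: the suffix is neither "gho" nor "new".
        System.out.println(originalTable("_orders_shadow")); // null
    }
}
```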


A possible solution would be to introduce a new configuration option that 
allows users to define a custom regular expression for identifying temporary 
tables created by gh-ost or pt-osc.

*Spec for configuration option:*
 * *Key:* {{scan.parse.online.schema.changes.pattern}}

 * *Type:* {{String}}

 * *Default:* {{^_(?<table>.*)_(gho|new)$}}

 * *Description:* Regular expression with a named capturing group 
{{(?<table>...)}}, used to identify temporary tables created by gh-ost or 
pt-osc and to map them back to their original tables.
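
A rough sketch of how a user-supplied pattern with the {{table}} group could be 
applied is shown here. It illustrates the proposal and is not an existing 
connector API: the class {{ConfigurableTempTableMatcher}}, its methods, and the 
custom pattern in {{main}} are hypothetical, and the check for the named group 
is a simple string test rather than full validation.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConfigurableTempTableMatcher {
    // Default proposed in this issue for scan.parse.online.schema.changes.pattern.
    static final String DEFAULT_PATTERN = "^_(?<table>.*)_(gho|new)$";

    private final Pattern pattern;

    ConfigurableTempTableMatcher(String regex) {
        // Heuristic check (assumption): require the named group up front so
        // a misconfigured pattern fails at startup, not at match time.
        if (!regex.contains("(?<table>")) {
            throw new IllegalArgumentException(
                    "Pattern must contain a named group (?<table>...): " + regex);
        }
        this.pattern = Pattern.compile(regex);
    }

    // Maps a temporary table name back to its original table, if it matches.
    Optional<String> originalTable(String tableName) {
        Matcher m = pattern.matcher(tableName);
        return m.matches() ? Optional.ofNullable(m.group("table")) : Optional.empty();
    }

    public static void main(String[] args) {
        ConfigurableTempTableMatcher defaults =
                new ConfigurableTempTableMatcher(DEFAULT_PATTERN);
        System.out.println(defaults.originalTable("_orders_gho").orElse("<no match>"));

        // Hypothetical user-defined pattern that also accepts a "_shadow" suffix.
        ConfigurableTempTableMatcher custom =
                new ConfigurableTempTableMatcher("^_(?<table>.*)_(gho|new|shadow)$");
        System.out.println(custom.originalTable("_orders_shadow").orElse("<no match>"));
    }
}
```

Using a named group instead of a positional one lets users reorder or add other 
groups in their pattern without changing which capture identifies the original 
table.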



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
