[
https://issues.apache.org/jira/browse/FLINK-38991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vishal P R updated FLINK-38991:
-------------------------------
Description:
The Flink CDC MySQL connector currently uses a hardcoded regular expression
{{^\_(.*)\_(gho|new)$}} to identify temporary tables created by the online
schema change tools gh-ost and pt-osc.
This assumes fixed naming conventions for temporary tables. However, both tools
support custom table naming:
* gh-ost allows custom names via the {{-force-table-names}} option.
* pt-osc supports custom table names through the {{--new-table-name}} option.
When these options are used, temporary table names may no longer match the
hardcoded pattern, causing the connector to fail to correctly detect and
associate temporary tables with their original tables. This can result in
online schema changes being missed or incorrectly parsed.
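The mismatch described above can be illustrated with a minimal Java sketch. It assumes the pattern is applied to the bare table name with a full match. The connector's actual matching code may differ, and the class name, method name, and the custom table name {{orders_migration_tmp}} are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HardcodedOscPattern {
    // The hardcoded pattern quoted in the description, without the Jira escaping.
    static final Pattern OSC_TEMP_TABLE = Pattern.compile("^_(.*)_(gho|new)$");

    /** Returns the original table name, or null when the name is not recognized. */
    static String originalTable(String tableName) {
        Matcher m = OSC_TEMP_TABLE.matcher(tableName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Default gh-ost and pt-osc naming is recognized.
        System.out.println(originalTable("_orders_gho"));          // orders
        System.out.println(originalTable("_orders_new"));          // orders
        // A custom temporary table name (hypothetical) is not recognized.
        System.out.println(originalTable("orders_migration_tmp")); // null
    }
}
```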
A possible solution would be to introduce a new configuration option that
allows users to define a custom regular expression for identifying temporary
tables created by gh-ost or pt-osc.
*Spec for configuration option:*
* *Key:* {{scan.parse.online.schema.changes.pattern}}
* *Type:* {{String}}
* *Default:* {{^\_(?<table>.*)\_(gho|new)$}}
* *Description:* Regular expression with a named capturing group
{{(?<table>...)}}, used to identify temporary tables created by gh-ost or
pt-osc and map them back to their original tables.
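As a rough sketch of how the proposed option could be consumed, the Java snippet below compiles a user-supplied pattern, checks that it declares the {{table}} group, and maps a temporary table name back to its original. This is an illustration under stated assumptions, not the connector's implementation: the class and method names are hypothetical, the group check is a simple substring heuristic, and the custom pattern {{^(?<table>.*)_osc_tmp$}} is an invented example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConfigurableOscPattern {
    // Default value proposed in the spec for scan.parse.online.schema.changes.pattern.
    static final String DEFAULT_PATTERN = "^_(?<table>.*)_(gho|new)$";

    /** Compiles the configured regex and rejects it if the "table" group is missing. */
    static Pattern compile(String regex) {
        // Heuristic check (assumption): look for the named-group syntax in the source text.
        if (!regex.contains("(?<table>")) {
            throw new IllegalArgumentException(
                    "Pattern must contain a named capturing group (?<table>...): " + regex);
        }
        return Pattern.compile(regex);
    }

    /** Returns the original table name if tableName is a temporary table, else empty. */
    static Optional<String> originalTable(Pattern pattern, String tableName) {
        Matcher m = pattern.matcher(tableName);
        return m.matches() ? Optional.ofNullable(m.group("table")) : Optional.empty();
    }

    public static void main(String[] args) {
        Pattern byDefault = compile(DEFAULT_PATTERN);
        System.out.println(originalTable(byDefault, "_orders_gho"));   // Optional[orders]
        System.out.println(originalTable(byDefault, "orders"));        // Optional.empty

        // Hypothetical custom naming scheme configured by the user.
        Pattern custom = compile("^(?<table>.*)_osc_tmp$");
        System.out.println(originalTable(custom, "orders_osc_tmp"));   // Optional[orders]
    }
}
```

Validating the named group when the option is read, rather than at match time, would surface a misconfigured pattern at job startup instead of during binlog parsing.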
was:
Flink CDC MySQL connector currently uses a hardcoded regular expression
{{^_(.*)_(gho|new)$}} to identify temporary tables created by online schema
change tools gh-ost and pt-osc.
This assumes fixed naming conventions for temporary tables. However, both tools
support custom table naming:
* gh-ost allows custom names via the {{-force-table-names}} option.
* pt-osc supports custom table names through the {{--new-table-name}} option.
When these options are used, temporary table names may no longer match the
hardcoded pattern, causing the connector to fail to correctly detect and
associate temporary tables with their original tables. This can result in
online schema changes being missed or incorrectly parsed.
A possible solution would be to introduce a new configuration option that
allows users to define a custom regular expression for identifying temporary
tables created by gh-ost or pt-osc.
*Spec for configuration option:*
* *Key:* {{scan.parse.online.schema.changes.pattern}}
* *Type:* {{String}}
* *Default:* {{^_(?<table>.*)_(gho|new)$}}
* *Description:* Regular expression with a named capturing group (?<table>...)
used to identify temporary tables created by gh-ost or pt-osc utilities and map
them back to their original tables.
> MySQL CDC fails to detect online schema changes with custom gh-ost or pt-osc
> table naming
> -----------------------------------------------------------------------------------------
>
> Key: FLINK-38991
> URL: https://issues.apache.org/jira/browse/FLINK-38991
> Project: Flink
> Issue Type: Improvement
> Components: Flink CDC
> Affects Versions: cdc-3.5.0
> Reporter: Vishal P R
> Priority: Major