[
https://issues.apache.org/jira/browse/FLINK-39718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yanquan Lv resolved FLINK-39718.
--------------------------------
Fix Version/s: cdc-3.6.0
Assignee: Thorne
Resolution: Fixed
Merged in master via 7acc625f1ed9ab974f74cc60449c75886c0ca583.
> Paimon pipeline sink fails with distributed source when target table does not
> exist
> -----------------------------------------------------------------------------------
>
> Key: FLINK-39718
> URL: https://issues.apache.org/jira/browse/FLINK-39718
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Reporter: Ran Tao
> Assignee: Thorne
> Priority: Major
> Labels: pull-request-available
> Fix For: cdc-3.6.0
>
>
> When a distributed pipeline source is used with the Paimon sink, the job may
> fail if the target table does not already exist.
> The failure happens in *DistributedPrePartitionOperator*. The hash function
> is rebuilt at this stage, and the previous *PaimonHashFunction* immediately
> accesses the external Paimon catalog to load the target table. If the table
> has not been created yet, *catalog.getTable(...)* throws
> *TableNotExistException*, and the job fails in the pre-partition stage.
> This issue is usually not exposed in MySQL-CDC pipelines because MySQL-CDC
> uses the regular topology. In that path, *CreateTableEvent* is handled by
> *SchemaOperator* first, and *MetadataApplier* creates the downstream table
> before records enter *RegularPrePartitionOperator*. As a result, the previous
> *PaimonHashFunction* can usually find the target table in the catalog.
> For distributed pipeline sources, the execution order is different:
> pre-partitioning happens before the target table is created by
> *MetadataApplier* in the schema coordination phase.
> This issue is not specific to Kafka. It can affect any distributed pipeline
> source with the same execution order.
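
The ordering problem described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: ToyCatalog, ToyTable, EagerHashFunction, the table name "db.orders" and the local TableNotExistException are stand-ins invented for illustration, not the Paimon or Flink CDC classes, and the sketch does not reproduce the change merged in the commit. It only shows why a hash function that looks the table up in its constructor succeeds when the table is created first (regular topology) and fails when pre-partitioning runs first (distributed topology).

```java
import java.util.HashMap;
import java.util.Map;

class PrePartitionOrderSketch {

    /** Hypothetical stand-in for the exception raised by a failed catalog lookup. */
    static class TableNotExistException extends Exception {
        TableNotExistException(String table) {
            super("Table does not exist: " + table);
        }
    }

    /** Toy table that only carries a bucket count. */
    static class ToyTable {
        final int buckets;

        ToyTable(int buckets) {
            this.buckets = buckets;
        }
    }

    /** Toy in-memory catalog (assumption: not the real Paimon Catalog API). */
    static class ToyCatalog {
        private final Map<String, ToyTable> tables = new HashMap<>();

        void createTable(String name, int buckets) {
            tables.put(name, new ToyTable(buckets));
        }

        ToyTable getTable(String name) throws TableNotExistException {
            ToyTable table = tables.get(name);
            if (table == null) {
                throw new TableNotExistException(name);
            }
            return table;
        }
    }

    /** Eager variant: the catalog lookup happens as soon as the function is built. */
    static class EagerHashFunction {
        private final int buckets;

        EagerHashFunction(ToyCatalog catalog, String table) throws TableNotExistException {
            this.buckets = catalog.getTable(table).buckets;
        }

        int partition(String key) {
            return Math.floorMod(key.hashCode(), buckets);
        }
    }

    /** Builds the hash function and partitions one record, reporting the outcome. */
    static String tryPrePartition(ToyCatalog catalog) {
        try {
            new EagerHashFunction(catalog, "db.orders").partition("order-1");
            return "ok";
        } catch (TableNotExistException e) {
            return "failed: " + e.getMessage();
        }
    }

    /** Regular order: the table is created before pre-partitioning. */
    static String regularOrder() {
        ToyCatalog catalog = new ToyCatalog();
        catalog.createTable("db.orders", 4);
        return tryPrePartition(catalog);
    }

    /** Distributed order: pre-partitioning runs before the table is created. */
    static String distributedOrder() {
        ToyCatalog catalog = new ToyCatalog();
        String result = tryPrePartition(catalog);
        catalog.createTable("db.orders", 4); // created too late for the lookup above
        return result;
    }

    public static void main(String[] args) {
        System.out.println("regular:     " + regularOrder());
        System.out.println("distributed: " + distributedOrder());
    }
}
```

In this toy model the only difference between the two paths is whether createTable runs before or after the hash function is constructed, which mirrors the execution-order difference reported in the issue.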