qidian99 opened a new issue, #1563: URL: https://github.com/apache/incubator-paimon/issues/1563
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.

### Paimon version

0.5-SNAPSHOT

### Compute Engine

Flink

### Minimal reproduce step

When restarting a job from a savepoint in the Apache Flink CDC (Change Data Capture) engine, it is crucial to consider the impact of topology changes and the proper handling of hash UIDs for state recovery.

For instance, consider a CDC job that captures changes from a source database and writes them to a target sink. Suppose the initial savepoint is taken while the job processes tables A and B from the source database. If a new table C is then added to the source database and the job is restarted from the savepoint, the absence of hash UIDs on the operators that process table C can leave state unavailable.

To address this, it is necessary to determine which states should be ignored or skipped when new tables are introduced. With hash UIDs configured on all affected operators, the CDC engine can correctly identify and recover the state associated with each operator during the restart. This keeps the CDC job reliable and complete even when its topology changes.

Please note that this issue specifically pertains to the Apache Flink CDC engine and its handling of job recovery from savepoints when the job's topology changes.

### What doesn't meet your expectations?

Currently, Paimon does not assign UIDs to tables. Note that UIDs should relate to both the database and the table; otherwise there may be a conflict when two tables have the same name but belong to different databases.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [X] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
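The request in the issue, UIDs that depend on both the database and the table, can be illustrated with a minimal sketch. The class `OperatorUids`, its method `of`, the `role` parameter, and the `database.table#role` format are all hypothetical names chosen for illustration; they are not Paimon's actual implementation.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of Paimon): derives a deterministic
 * operator UID from the database name, the table name, and an operator role.
 */
final class OperatorUids {
    private OperatorUids() {}

    /** Builds a UID of the assumed form "database.table#role". */
    static String of(String database, String table, String role) {
        Objects.requireNonNull(database, "database");
        Objects.requireNonNull(table, "table");
        Objects.requireNonNull(role, "role");
        return escape(database) + "." + escape(table) + "#" + escape(role);
    }

    // Escape the separator characters so that, for example,
    // ("a.b", "c") and ("a", "b.c") cannot produce the same UID.
    private static String escape(String part) {
        return part.replace("\\", "\\\\")
                   .replace(".", "\\.")
                   .replace("#", "\\#");
    }
}
```

A string built this way could be passed to Flink's `uid(String)` on each per-table operator. Because the UID includes the database name, two tables both named `orders` in `db1` and `db2` get distinct UIDs, and the operators for tables A and B keep the same UIDs when table C is added, so their state can still be matched on restore.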
