ouyangwulin created FLINK-38277:
-----------------------------------

             Summary: Enhance postgresSQL slot management capabilities
                 Key: FLINK-38277
                 URL: https://issues.apache.org/jira/browse/FLINK-38277
             Project: Flink
          Issue Type: Improvement
          Components: Flink CDC
    Affects Versions: cdc-3.5.0
            Reporter: ouyangwulin

1. Background

A replication slot is an important PostgreSQL mechanism that is closely tied to Write-Ahead Logging (WAL). It is used mainly in streaming replication and logical replication to ensure that the primary does not prematurely delete WAL that a standby still needs. The Postgres connector uses logical replication for incremental data synchronization. Flink CDC supports both batch mode and streaming mode. If the slot is not deleted in batch mode, WAL keeps accumulating on the PostgreSQL primary and can occupy a large amount of disk space.

2. Enhance the solution

2.1. Batch mode

A pipeline job runs as a batch job when execution.runtime-mode=batch is set, which requires scan.startup.mode=snapshot. The job should not create slots when it starts, and any slot that does get created must be deleted when the job finishes executing; otherwise slots are left over.

In SQL mode, scan.startup.mode=snapshot makes the job read only the full data. It reads no incremental data, and that includes backfill data. The same rule applies: do not create slots when the job starts, and delete any slot when the job finishes executing; otherwise slots are left over.

2.2. Streaming mode

When a job stops in streaming mode, its slot must not be deleted, otherwise the job state is lost. Backfill in streaming mode, however, creates a child slot. That child slot must be deleted, otherwise it is left behind.
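The cleanup rules in sections 2.1 and 2.2 can be sketched as a small decision function. This is only an illustration of the proposed policy, not Flink CDC code: `SlotKind`, `JobConfig`, `should_drop_slot_on_finish`, and `drop_slot_sql` are hypothetical names, and the value "initial" in the usage lines stands in for any non-snapshot startup mode. The only PostgreSQL call referenced is `pg_drop_replication_slot`, which the sketch formats as a statement but does not execute.

```python
from dataclasses import dataclass
from enum import Enum


class SlotKind(Enum):
    # Hypothetical labels for the two kinds of slot the issue describes.
    MAIN = "main"          # slot that carries the job's replication position
    BACKFILL = "backfill"  # the "child slot" created during backfill


@dataclass(frozen=True)
class JobConfig:
    runtime_mode: str       # value of execution.runtime-mode
    scan_startup_mode: str  # value of scan.startup.mode


def is_snapshot_only(cfg: JobConfig) -> bool:
    """True when the job reads only the full data (section 2.1)."""
    return cfg.scan_startup_mode == "snapshot"


def should_drop_slot_on_finish(cfg: JobConfig, kind: SlotKind) -> bool:
    """Decide whether a slot should be dropped when the job stops."""
    # A backfill (child) slot is never needed after the backfill completes,
    # in any mode (section 2.2).
    if kind is SlotKind.BACKFILL:
        return True
    # A snapshot-only job has no incremental phase, so nothing should keep
    # WAL pinned after it finishes (section 2.1).
    # A streaming job keeps its main slot so its state is not lost.
    return is_snapshot_only(cfg)


def drop_slot_sql(slot_name: str) -> str:
    """Format the statement that drops a slot; escapes single quotes."""
    escaped = slot_name.replace("'", "''")
    return f"SELECT pg_drop_replication_slot('{escaped}');"


if __name__ == "__main__":
    batch = JobConfig(runtime_mode="batch", scan_startup_mode="snapshot")
    streaming = JobConfig(runtime_mode="streaming", scan_startup_mode="initial")
    print(should_drop_slot_on_finish(batch, SlotKind.MAIN))
    print(should_drop_slot_on_finish(streaming, SlotKind.MAIN))
    print(should_drop_slot_on_finish(streaming, SlotKind.BACKFILL))
    print(drop_slot_sql("flink_slot"))
```

Keeping the decision separate from the statement that drops the slot makes the policy easy to test without a database. Where the connector would actually invoke it (for example, on job finish or after each backfill split) is left open here.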