leonardBang commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1623938661
##########
docs/content/docs/connectors/flink-sources/postgres-cdc.md:
##########
@@ -236,6 +236,17 @@ Connector Options
so it does not need to be explicitly configured
'execution.checkpointing.checkpoints-after-tasks-finish.enabled' = 'true'
</td>
</tr>
+ <tr>
+ <td>scan.lsn-commit.checkpoints-num-delay</td>
+ <td>optional</td>
+ <td style="word-wrap: break-word;">3</td>
+ <td>Integer</td>
+ <td>The number of checkpoint delays before starting to commit the LSN offsets. <br>
+ The checkpoint LSN offsets will be committed in rolling fashion, the earliest checkpoint identifier will be committed first from the delayed checkpoints.
+ This will enable continuous recycling of log files, preventing disk space issues. <br>
+ This feature is not available in `PostgreSQLSource` since it is deprecated.
Review Comment:
When consuming PostgreSQL logs, the LSN offset must be committed to trigger
log data cleanup for the corresponding slot. However, once an LSN offset is
committed, earlier offsets become invalid. To keep earlier LSN offsets
available for job recovery, the LSN commit is delayed by 3 checkpoints by
default. This feature is available when the config option
`scan.incremental.snapshot.enabled` is set to true.
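
To make the rolling behaviour concrete for readers of the docs, here is a minimal sketch of delayed commits, assuming completed checkpoint ids are kept in a min-heap and the earliest one is released only once more than `checkpointsNumDelay` are pending. The class `DelayedLsnCommitter` and the method `onCheckpointComplete` are hypothetical names for illustration; this is not the connector's actual code.

```java
import java.util.OptionalLong;
import java.util.PriorityQueue;

/** Illustrative only: hypothetical class, not part of Flink CDC. */
public class DelayedLsnCommitter {
    // Corresponds to the value of scan.lsn-commit.checkpoints-num-delay (assumed mapping).
    private final int checkpointsNumDelay;
    // Completed checkpoints whose LSN offsets have not been committed yet, smallest id first.
    private final PriorityQueue<Long> pending = new PriorityQueue<>();

    public DelayedLsnCommitter(int checkpointsNumDelay) {
        if (checkpointsNumDelay < 0) {
            throw new IllegalArgumentException("delay must be >= 0");
        }
        this.checkpointsNumDelay = checkpointsNumDelay;
    }

    /**
     * Records a completed checkpoint and returns the checkpoint id whose LSN
     * offset may now be committed, or empty if the delay has not been reached.
     */
    public OptionalLong onCheckpointComplete(long checkpointId) {
        pending.add(checkpointId);
        if (pending.size() <= checkpointsNumDelay) {
            // Still within the delay window: keep earlier offsets valid for recovery.
            return OptionalLong.empty();
        }
        // Commit the earliest pending checkpoint first (rolling fashion).
        return OptionalLong.of(pending.poll());
    }

    public static void main(String[] args) {
        DelayedLsnCommitter committer = new DelayedLsnCommitter(3);
        for (long id = 1; id <= 6; id++) {
            OptionalLong toCommit = committer.onCheckpointComplete(id);
            String action = toCommit.isPresent()
                    ? "commit LSN of checkpoint " + toCommit.getAsLong()
                    : "nothing committed yet";
            System.out.println("checkpoint " + id + " completed: " + action);
        }
    }
}
```

With a delay of 3, this sketch commits nothing for checkpoints 1 to 3, then commits checkpoint 1 when checkpoint 4 completes, checkpoint 2 when checkpoint 5 completes, and so on, so the three most recent checkpoints always remain restorable.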