[ 
https://issues.apache.org/jira/browse/FLINK-36683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Runkang He updated FLINK-36683:
-------------------------------
    Affects Version/s: cdc-3.2.0
                           (was: cdc-3.3.0)
                           (was: cdc-3.2.1)

> Support metadata 'row_kind' virtual column for Mongo CDC Connector
> ------------------------------------------------------------------
>
>                 Key: FLINK-36683
>                 URL: https://issues.apache.org/jira/browse/FLINK-36683
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.0
>            Reporter: Runkang He
>            Priority: Major
>              Labels: pull-request-available
>
> The 'row_kind' metadata column is useful in real user scenarios. The two 
> main scenarios are:
> 1. Save all upstream messages: the downstream stores every message from 
> upstream, including delete messages. To achieve this, the full changelog 
> must be converted to append-only messages, with the row_kind metadata 
> recording the change kind of each message.
> 2. Ignore upstream delete messages: to save storage space, the upstream 
> CDC source often deletes historical data regularly and retains only the 
> most recent seven days of data. However, the business requires the 
> downstream OLAP system to retain the full history, so delete messages from 
> the source must be ignored. A reasonable way to do this is to filter them 
> out using the row_kind metadata.
> So I think we should support 'row_kind' metadata in the Mongo CDC Connector.
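
The two scenarios above can be illustrated with a minimal, connector-independent sketch. It does not use any Flink or Mongo CDC API: ChangeEvent, to_append_only and drop_deletes are hypothetical names introduced only for illustration, and the sketch assumes the row kind is exposed as the short strings "+I", "-U", "+U" and "-D".

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed string encoding of the four changelog row kinds.
INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE = "+I", "-U", "+U", "-D"


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical stand-in for one changelog record from the source."""
    row_kind: str
    row: Dict[str, object]


def to_append_only(events: List[ChangeEvent]) -> List[Dict[str, object]]:
    """Scenario 1: keep every event and expose row_kind as a plain column."""
    return [{**e.row, "row_kind": e.row_kind} for e in events]


def drop_deletes(events: List[ChangeEvent]) -> List[ChangeEvent]:
    """Scenario 2: discard delete events so the downstream keeps full history."""
    return [e for e in events if e.row_kind != DELETE]


# A small changelog: insert, update (before/after images), then delete.
changelog = [
    ChangeEvent(INSERT, {"_id": 1, "name": "a"}),
    ChangeEvent(UPDATE_BEFORE, {"_id": 1, "name": "a"}),
    ChangeEvent(UPDATE_AFTER, {"_id": 1, "name": "b"}),
    ChangeEvent(DELETE, {"_id": 1, "name": "b"}),
]

append_only_rows = to_append_only(changelog)
rows_without_deletes = drop_deletes(changelog)
```

In the first case the delete survives as an ordinary row whose row_kind is "-D"; in the second it never reaches the sink. Either way, the change kind has to be readable as data, which is what the requested metadata column would provide.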



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
