[ 
https://issues.apache.org/jira/browse/FLINK-39765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-39765:
-----------------------------------
    Labels: Flink-CDC pull-request-available  (was: Flink-CDC)

> Add FLIP-314 lineage support (LineageVertexProvider) to MySQL CDC source 
> connector
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-39765
>                 URL: https://issues.apache.org/jira/browse/FLINK-39765
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>            Reporter: Jaskaran Singh Kukreja
>            Priority: Minor
>              Labels: Flink-CDC, pull-request-available
>
> *Summary*
> Implement Flink's {{LineageVertexProvider}} interface (FLIP-314) on 
> {{MySqlSource}} to enable native lineage reporting.
> *Motivation*
> Flink 1.20 introduced FLIP-314 which provides a native lineage API via 
> {{{}LineageVertexProvider{}}}. OpenLineage's Flink integration uses this to 
> extract source/sink datasets and emit lineage events. Currently, none of the 
> CDC source connectors implement {{{}LineageVertexProvider{}}}, so OpenLineage 
> events emitted from CDC pipelines contain empty inputs.
> *Proposed Changes*
> Implement {{LineageVertexProvider}} on {{MySqlSource}} to report captured 
> tables as input datasets with their schemas. The scope of this ticket targets 
> MySQL only, but shared lineage utilities will be added to 
> {{flink-cdc-common}} (e.g., {{{}LineageUtils{}}}, 
> {{{}CdcSourceLineageVertex{}}}, {{{}CdcLineageDataset{}}}) so that other 
> source connectors (Postgres, Oracle, etc.) can easily adopt lineage support 
> in follow-up PRs.
> *Result*
> Each captured MySQL table is reported as an input dataset using Flink native 
> APIs with openlineage standard:
>  * {*}Namespace{*}: {{mysql://hostname:port}}
>  * {*}Name{*}: resolved table name (e.g., {{{}mydb.users{}}})
>  * {*}Config facet{*}: source type ({{{}mysql-cdc{}}})
>  * {*}Schema facet{*}: column names and MySQL types
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to