[
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783147#comment-17783147
]
Fang Yong commented on FLINK-31275:
-----------------------------------
Hi [~mobuchowski], thanks for your reply.
I think our ideas are consistent, just at different levels of abstraction. The
interface `LineageVertex` is the top interface for connectors in Flink, and we
implement `TableLineageVertex` for tables, because a Table is a complete
definition, including the database, schema, etc. We put the options from the
`WITH` clause into a map, which is consistent with how SQL is defined and used
in Flink.
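As a rough illustration of the table case, the sketch below shows a table vertex that carries its `WITH` options as a plain map, and a listener-side helper that reads the connector type from it. All type, field, and method names here (`TableLineageSketch`, `database`, `table`, `options`, `connectorOf`) are assumptions made for this example and are not the actual FLIP-314 interfaces.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; these types are illustrative, not the FLIP-314 API.
class TableLineageSketch {

    /** Stand-in for the top-level lineage interface. */
    interface LineageVertex {}

    /** A table vertex: identifier parts plus the WITH options as a plain map. */
    static final class TableLineageVertex implements LineageVertex {
        final String database;
        final String table;
        final Map<String, String> options;

        TableLineageVertex(String database, String table, Map<String, String> options) {
            this.database = database;
            this.table = table;
            // Defensive copy so the vertex cannot be changed after creation.
            this.options = Collections.unmodifiableMap(new LinkedHashMap<>(options));
        }
    }

    /** What a listener might do: read the connector type from the options map. */
    static String connectorOf(TableLineageVertex vertex) {
        return vertex.options.getOrDefault("connector", "unknown");
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "kafka");
        options.put("topic", "orders");
        TableLineageVertex vertex = new TableLineageVertex("sales", "orders", options);
        System.out.println(vertex.database + "." + vertex.table + " -> " + connectorOf(vertex));
    }
}
```

With a map, the listener does not need a connector-specific class on its classpath; it only needs to know which option keys to look for.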
For the official Flink connectors, we will implement `LineageVertex` for
`Source` and `InputFormat` in `DataStream` jobs, for example `KafkaSourceLineage`,
as we mentioned in the FLIP: `We will implement LineageVertexProvider for
the builtin source and sink such as KafkaSource, HiveSource,
FlinkKafkaProducerBase and etc.`
End users don't need to implement them. To stay consistent with how tables are
used, these implementations will put the corresponding information into a map
that users can read.
So I think our current point of divergence is which level of abstraction the
user needs to be aware of. In the current FLIP, for `DataStream` jobs, listener
developers need to identify whether the `LineageVertex` is a
`KafkaSourceLineageVertex` or a `JdbcLineageVertex`. Do you mean we need to
define another layer, such as a `DataSetConfig` interface, so that the listener
developer can identify whether it is a `KafkaDataSetConfig` or a
`JdbcDataSetConfig`?
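To make the two options concrete, here is a rough sketch of what a listener would have to check in each case. The class names follow the ones used in this thread, but every field, constructor, and the `ConfiguredLineageVertex` wrapper are assumptions for illustration only; none of this is the real FLIP-314 API.

```java
// Hypothetical sketch of the two abstraction levels; not the FLIP-314 API.
class VertexDispatchSketch {

    /** Stand-in for the top-level lineage interface. */
    interface LineageVertex {}

    // Option A (current FLIP): each connector exposes its own vertex type.
    static final class KafkaSourceLineageVertex implements LineageVertex {
        final String topic;
        KafkaSourceLineageVertex(String topic) { this.topic = topic; }
    }

    static final class JdbcLineageVertex implements LineageVertex {
        final String url;
        JdbcLineageVertex(String url) { this.url = url; }
    }

    // Option B (extra layer): one vertex type wrapping a connector-specific config.
    interface DataSetConfig {}

    static final class KafkaDataSetConfig implements DataSetConfig {
        final String topic;
        KafkaDataSetConfig(String topic) { this.topic = topic; }
    }

    static final class JdbcDataSetConfig implements DataSetConfig {
        final String url;
        JdbcDataSetConfig(String url) { this.url = url; }
    }

    /** Assumed wrapper type; only here to show where the config would live. */
    static final class ConfiguredLineageVertex implements LineageVertex {
        final DataSetConfig config;
        ConfiguredLineageVertex(DataSetConfig config) { this.config = config; }
    }

    /** Listener-side dispatch for option A: check the vertex type itself. */
    static String describeByVertex(LineageVertex vertex) {
        if (vertex instanceof KafkaSourceLineageVertex) {
            return "kafka:" + ((KafkaSourceLineageVertex) vertex).topic;
        }
        if (vertex instanceof JdbcLineageVertex) {
            return "jdbc:" + ((JdbcLineageVertex) vertex).url;
        }
        return "unknown";
    }

    /** Listener-side dispatch for option B: check the config inside the vertex. */
    static String describeByConfig(ConfiguredLineageVertex vertex) {
        DataSetConfig config = vertex.config;
        if (config instanceof KafkaDataSetConfig) {
            return "kafka:" + ((KafkaDataSetConfig) config).topic;
        }
        if (config instanceof JdbcDataSetConfig) {
            return "jdbc:" + ((JdbcDataSetConfig) config).url;
        }
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(describeByVertex(new KafkaSourceLineageVertex("orders")));
        System.out.println(describeByConfig(
                new ConfiguredLineageVertex(new JdbcDataSetConfig("db.example/sales"))));
    }
}
```

In both variants the listener still does a type check per connector; the difference is only whether that check is made on the vertex or on an object nested inside it.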
Our current use of `LineageVertex` is mainly for flexibility: it makes it easy
to add more returned information to the lineage vertex of a `DataStream` job,
such as the vector type data source information mentioned in the FLIP example.
At the same time, connector maintainers can easily provide a lineage vertex
for customized connectors. If the connector is in table format, we prefer that
users directly provide a `TableLineageVertex` instance.
> Flink supports reporting and storage of source/sink tables relationship
> -----------------------------------------------------------------------
>
> Key: FLINK-31275
> URL: https://issues.apache.org/jira/browse/FLINK-31275
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / Planner
> Affects Versions: 1.18.0
> Reporter: Fang Yong
> Assignee: Fang Yong
> Priority: Major
>
> FLIP-314 has been accepted
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener