davidradl commented on code in PR #24618:
URL: https://github.com/apache/flink/pull/24618#discussion_r1613290123
##########
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/LineageGraph.java:
##########
@@ -20,13 +20,12 @@
package org.apache.flink.streaming.api.lineage;
import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.streaming.api.graph.StreamGraph;
import java.util.List;
/**
- * Job lineage is built according to {@link StreamGraph}, users can get sources, sinks and
- * relationships from lineage and manage the relationship between jobs and tables.
+ * Job lineage graph that users can get sources, sinks and relationships from lineage and manage the
Review Comment:
> Thanks David for your comments. Yes, the documentation will be added after
> adding the job lineage listener, which is more user facing. It is planned in
> this Jira: https://issues.apache.org/jira/browse/FLINK-33212. This PR only
> considers source/sink level lineage. Column level lineage is not included in
> this work, so internal transformations do not need lineage info for now. Would
> you please elaborate more on "I assume a sink could be a source - so could be
> in both current lists"?

Hi Peter, we usually think of lineage assets as the nodes in the lineage graph
(as in OpenLineage). An asset could be a Kafka topic, and that topic could be
used as a source for some flows and as a sink for others. I was wondering how
this fits with lineage at the table level, where one table could be defined as
a sink and another as a source on the same Kafka topic. I guess that when
exporting or exposing to OpenLineage, many Flink tables referring to the same
topic would end up as one OpenLineage node. The natural way for Flink to store
the lineage is at the table level rather than at the asset level, so on
reflection I think this is fine.
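
To make the table-level versus asset-level distinction concrete, here is a minimal sketch in plain Java. It is not the API in this PR: `TableRef`, `Role`, `groupByAsset`, and the `kafka://...` asset identifiers are all hypothetical names invented for illustration, and how an exporter would really derive an asset identifier from a table is an assumption. It only shows several table-level entries collapsing into one asset-level node when exported.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TableToAssetLineage {

    /** Hypothetical role a table plays in a job. */
    enum Role { SOURCE, SINK }

    /** Hypothetical table-level lineage entry: a table bound to a physical asset. */
    static final class TableRef {
        final String tableName;
        final String assetId; // e.g. an identifier for the underlying Kafka topic (assumed format)
        final Role role;

        TableRef(String tableName, String assetId, Role role) {
            this.tableName = tableName;
            this.assetId = assetId;
            this.role = role;
        }
    }

    /** Groups table-level entries by the asset they refer to, preserving first-seen order. */
    static Map<String, List<TableRef>> groupByAsset(List<TableRef> tables) {
        Map<String, List<TableRef>> byAsset = new LinkedHashMap<>();
        for (TableRef t : tables) {
            byAsset.computeIfAbsent(t.assetId, k -> new ArrayList<>()).add(t);
        }
        return byAsset;
    }

    /** Collects the distinct roles played by the tables that share one asset. */
    static Set<Role> rolesOf(List<TableRef> tablesOnOneAsset) {
        Set<Role> roles = EnumSet.noneOf(Role.class);
        for (TableRef t : tablesOnOneAsset) {
            roles.add(t.role);
        }
        return roles;
    }

    public static void main(String[] args) {
        // Two tables on the same topic (one source, one sink) plus an unrelated sink.
        List<TableRef> tables = List.of(
                new TableRef("orders_in", "kafka://orders", Role.SOURCE),
                new TableRef("orders_out", "kafka://orders", Role.SINK),
                new TableRef("audit_out", "kafka://audit", Role.SINK));

        // Each map entry corresponds to one asset-level node.
        for (Map.Entry<String, List<TableRef>> e : groupByAsset(tables).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size()
                    + " table(s), roles " + rolesOf(e.getValue()));
        }
    }
}
```

In this sketch the source and sink lists can stay table-based inside Flink, and the merge into a single node happens only at export time, which is why keeping the same topic in both lists is not a problem.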