Francisco Morillo created FLINK-39813:
-----------------------------------------

             Summary: [Connectors/Kinesis] Add lineage support to Kinesis 
Streams connector
                 Key: FLINK-39813
                 URL: https://issues.apache.org/jira/browse/FLINK-39813
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kinesis
    Affects Versions: aws-connector-6.0.0
            Reporter: Francisco Morillo


  The Kinesis Streams connector (flink-connector-aws-kinesis-streams) does not 
implement the LineageVertexProvider
  interface introduced in Flink 2.0. This means lineage tools like OpenLineage 
cannot automatically extract dataset
  information from Flink jobs using KDS sources and sinks.

  The Kafka connector already implements this successfully. This change adds 
the same capability to the Kinesis Streams
  connector.

  Implementation:
  - KinesisStreamsSource implements LineageVertexProvider, returning 
SourceLineageVertex with namespace derived from
  stream ARN
  - KinesisStreamsSink implements LineageVertexProvider, returning 
LineageVertex with the same ARN-based namespace
  - New utility package: KinesisLineageUtil, KinesisDatasetFacet, 
TypeDatasetFacet
  - Unit tests covering source, sink, and utility methods

  Namespace format: arn:\{partition}:kinesis:\{region}:\{account}:stream
  Dataset name: stream name extracted from the ARN

  This covers:
  - DataStream API (KinesisStreamsSource/Sink implement the interface directly)
  - SQL/Table API (KinesisDynamicSource internally creates 
KinesisStreamsSource, planner extracts lineage automatically)

  End-to-end tested with OpenLineage flink2 module (v1.47.1) confirming lineage 
events are emitted with correct
  namespace and stream name when using the DataStream API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to