Xin Gao created FLINK-39238:
-------------------------------

             Summary: Support watermark alignment in flink-connector-kafka in 
dynamic Kafka source mode
                 Key: FLINK-39238
                 URL: https://issues.apache.org/jira/browse/FLINK-39238
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kafka
            Reporter: Xin Gao


Flink source ingestion is with watermark alignment config like 
`scan.watermark.alignment.max-drift`. The operator would then control
 * checkWatermarkAlignment for whole reader
 * checkSplitWatermarkAlignment for the split reader

In dynamic Kafka source mode, the split reader pause/resume is missed 
(`pauseOrResumeSplits` not implemented) and the exceptions might raise once 
such config enabled.

This feature is important for workflows like ingesting data from Kafka to Data 
Lake. Missing the alignment could create far more number of small files once 
the ingestion progress diverges and runs our of controls.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to