[ 
https://issues.apache.org/jira/browse/FLINK-35419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arvid Heise updated FLINK-35419:
--------------------------------
    Fix Version/s: kafka-3.3.0

> scan.bounded.latest-offset makes queries never finish if the latest message 
> is a EndTxn Kafka marker
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-35419
>                 URL: https://issues.apache.org/jira/browse/FLINK-35419
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.8.0, 1.16.0, 1.17.0, 1.19.0
>            Reporter: Fabian Paul
>            Priority: Major
>             Fix For: kafka-3.3.0
>
>
> When running the kafka connector in bounded mode, the stop condition can be 
> defined as the latest offset when the job starts. Unfortunately, Kafka's 
> latest offset calculation also includes special marker records, such as 
> transaction markers, in the overall count.
>  
> When Flink waits for a job to finish, it compares the number of records read 
> until the point with the original latest offset [1]. Since the consumer will 
> never see the special marker records, the latest offset is never reached, and 
> the job gets stuck. 
>  
> To reproduce the issue, you can write into a Kafka topic and make sure that 
> the latest record is a transaction end event. Afterwards you can start a 
> Flink job configured with `scan.bounded.latest-offset` pointing to that topic.
>  
> [1]https://github.com/confluentinc/flink/blob/59c5446c4aac0d332a21b456f4a3f82576104b80/flink-connectors/confluent-connector-kafka/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java#L128



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to