Hi everyone,
Kafka 4.0 introduced Share Groups (KIP-932) -- a competing-consumer model where 
multiple consumers read from the same partition with per-record acknowledgment. 
This brings task-queue semantics to Kafka, a capability that has no equivalent 
in the current Flink Kafka connector.
I would like to propose adding a KafkaShareGroupSource to the Flink Kafka 
connector. The core idea:

•  Wrap KafkaShareConsumer in EXPLICIT acknowledgment mode
•  RENEW acquisition locks on every poll cycle to keep records locked during 
processing
•  PAUSE the fetcher on checkpoint barrier, ACCEPT all in-flight records on 
checkpoint completion, then RESUME
•  On failure, un-acknowledged records auto-release on the broker after lock 
timeout and are re-delivered to any available consumer -- at-least-once 
guaranteed, zero Flink-side record state needed
The design introduces one user-facing switch: scan.share-group.id in SQL or 
setShareGroupId() in DataStream API. Everything else is internal. No changes to 
existing KafkaSource, KafkaSink, or any existing tests. Fully backward 
compatible.
The full proposal is here: [FLIP-573: Kafka Share Groups for Queue-Like Event 
Processing in Flink Connector Kafka-link]
I would appreciate feedback on the overall approach, particularly: Any concerns 
about the broker-side acquisition lock timeout interaction with Flink 
checkpointing?


Regards,

Shekharrajak

Reply via email to