Hi Shekhar, Thanks for driving this initiative.
I notice that the Kafka queue semantics feature introduced in Kafka 4.1 is currently marked as not recommended for production use [1]. Since this feature requires the Kafka 4.1 client, we’ll need to factor that into our compatibility and deployment considerations. Given that this model changes the consumption behaviour—specifically removing the concept of committed offsets—I’m curious about its impact on Kafka watermarking [2], which currently operates on a partition basis. Would it make sense to explore watermarking on a share group basis instead? Also, I assume the proposed SplitReader implementation is designed to read from share groups rather than individual partitions. Could you confirm? Kind regards, David ⸻ Let me know if you'd like to tailor this further for a specific audience (e.g., more technical, more managerial), or if you'd like help drafting a follow-up or comment for the Google Doc. [1] https://cwiki.apache.org/confluence/display/KAFKA/Queues+for+Kafka+%28KIP-932%29+-+Preview+Release+Notes [2] https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/datastream/event-time/generating_watermarks/#watermark-strategies-and-the-kafka-connector Kind regards, David. From: RShekhar Prasad <[email protected]> Date: Tuesday, 2 September 2025 at 15:07 To: [email protected] <[email protected]> Subject: [EXTERNAL] Re: Kafka 4.x Queue Semantics in Flink Connector Kafka Hi team, The initial draft of this feature is now available in the Google Doc: https://docs.google.com/document/d/1_wf3GwqWV3yiD-0oG2d6jwN8bfdnbk-Kz13btva_QAc/edit?usp=sharing . Kindly review the document and share your feedback or suggestions in the comments to help enhance the content further. Regards, Shekhar Prasad Rajak, On Tuesday 2 September 2025 at 03:39:31 pm GMT+5:30, Gyula Fóra <[email protected]> wrote: Hello! I see there is a WIP PR that has a bit more architecture / technical information here: https://github.com/apache/flink-connector-kafka/pull/189 It would be nice to put this into a google doc proposal that follows the FLIP format for discussion. One question that came to me immediately is regarding the exactly-once semantics. The current Flink kafka consumer implementation provides exactly-once read semantics, but it's not clear if this new approach would only have at-least-once or exactly-once is also possible. Thanks Gyula On Sat, Aug 30, 2025 at 7:30 PM RShekhar Prasad <[email protected]> wrote: > Hello team, > KIP-932 introduces share groups as a new consumption model that provides > queue semantics. > This directly addresses use cases where: > > 1. Multiple consumers need to process items efficiently in parallel from a > single/multiple topic(s). > 2. Messages need explicit acknowledgment/release (to avoid reprocessing > or allow retries).Use cases where scaling Flink ML/LLM workload is critical > - Shifting Kafka coordination and assignment logic to the broker side would > simplify today’s complex Flink source management, making consumption more > efficient, scalable, and far less error-prone. > Operational Benefits > > - Higher Throughput: ShareGroupHeartbeat helps in Queue-like workloads, > maximum throughput scenarios. Share groups distribute messages at the > record level, not partition level, so multiple readers can consume from the > same topic with Kafka coordinating message distribution. > - Better Availability and Flexible Scaling: consumers assignment logic > is simpler in server side and rebalancing frequency is minimised. > > Let's have discussion over the design and how the checkpointing will work > when we use KafkaShareConsumer API from Kafka 4.1 . > Regards, > Shekhar Prasad Rajak, > Blog | Github | Twitter > Unless otherwise stated above: IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: Building C, IBM Hursley Office, Hursley Park Road, Winchester, Hampshire SO21 2JN
