Hi Shekhar,

Thanks for driving this initiative.


I notice that the Kafka queue semantics feature introduced in Kafka 4.1 is 
currently marked as not recommended for production use [1]. Since this feature 
requires the Kafka 4.1 client, we’ll need to factor that into our compatibility 
and deployment considerations.


Given that this model changes the consumption behaviour—specifically removing 
the concept of committed offsets—I’m curious about its impact on Kafka 
watermarking [2], which currently operates on a partition basis. Would it make 
sense to explore watermarking on a share group basis instead?


Also, I assume the proposed SplitReader implementation is designed to read from 
share groups rather than individual partitions. Could you confirm?


Kind regards,

David

⸻

Let me know if you'd like to tailor this further for a specific audience (e.g., 
more technical, more managerial), or if you'd like help drafting a follow-up or 
comment for the Google Doc.


[1] 
https://cwiki.apache.org/confluence/display/KAFKA/Queues+for+Kafka+%28KIP-932%29+-+Preview+Release+Notes
[2] 
https://nightlies.apache.org/flink/flink-docs-release-2.1/docs/dev/datastream/event-time/generating_watermarks/#watermark-strategies-and-the-kafka-connector

Kind regards, David.
From: RShekhar Prasad <[email protected]>
Date: Tuesday, 2 September 2025 at 15:07
To: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: Kafka 4.x Queue Semantics in Flink Connector Kafka

Hi team,
The initial draft of this feature is now available in the Google Doc: 
https://docs.google.com/document/d/1_wf3GwqWV3yiD-0oG2d6jwN8bfdnbk-Kz13btva_QAc/edit?usp=sharing
 . Kindly review the document and share your feedback or suggestions in the 
comments to help enhance the content further.

Regards,
Shekhar Prasad Rajak,







   On Tuesday 2 September 2025 at 03:39:31 pm GMT+5:30, Gyula Fóra 
<[email protected]> wrote:

 Hello!

I see there is a WIP PR that has a bit more architecture / technical
information here: https://github.com/apache/flink-connector-kafka/pull/189
It would be nice to put this into a google doc proposal that follows the
FLIP format for discussion.

One question that came to me immediately is regarding the exactly-once
semantics. The current Flink kafka consumer implementation provides
exactly-once read semantics, but it's not clear if this new approach would
only have at-least-once or exactly-once is also possible.

Thanks
Gyula

On Sat, Aug 30, 2025 at 7:30 PM RShekhar Prasad <[email protected]>
wrote:

> Hello team,
> KIP-932 introduces share groups as a new consumption model that provides
> queue semantics.
> This directly addresses use cases where:
>
> 1. Multiple consumers need to process items efficiently in parallel from a
> single/multiple topic(s).
> 2. Messages  need explicit acknowledgment/release (to avoid reprocessing
> or allow retries).Use cases where scaling Flink ML/LLM workload is critical
> - Shifting Kafka coordination and assignment logic to the broker side would
> simplify today’s complex Flink source management, making consumption more
> efficient, scalable, and far less error-prone.
>  Operational Benefits
>
>  - Higher Throughput: ShareGroupHeartbeat helps in Queue-like workloads,
> maximum throughput scenarios. Share groups distribute messages at the
> record level, not partition level, so multiple readers can consume from the
> same topic with Kafka coordinating message distribution.
>  - Better Availability and Flexible Scaling: consumers assignment logic
> is simpler in server side and rebalancing frequency is minimised.
>
> Let's have discussion over the design and how the checkpointing will work
> when we use KafkaShareConsumer  API from Kafka 4.1 .
> Regards,
> Shekhar Prasad Rajak,
> Blog | Github | Twitter
>


Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: Building C, IBM Hursley Office, Hursley Park Road, 
Winchester, Hampshire SO21 2JN

Reply via email to