Hi Sushant, thanks for the KIP. LM1: About the validation for brokers that may not support this feature, the KIP proposes to validate on the "acknowledge" call, but that does not build any request or have any relationship with the NetworkClient (where the ApiVersions info lives), so how exactly are we imagining we would have the node version to check on acknowledge? It's usually when we build the request that we have the info handy, to check if the Node version is not what the client needs for a feature it is using. So here it would mean that the error would be discovered on the call to poll or commit following the acknowledge I expect. I do agree that it makes more sense to throw it on ack, but just pointing out the challenge to make sure we thought about it.
LM2: About the IllegalArgumentException when using an ack type not supported by the broker, I wonder if it should be UnsupportedVersionException? I think this is what we consistently throw when the client attempts to use a feature not supported by the broker version. It's actually a common pattern on the client (related to both comments LM1 and LM2): check node version when building the request and throwing UnsupportedVersionException depending on a feature being used: ex. https://github.com/apache/kafka/blob/4cb5f1ca124b810b32fcef1ca23f00faf70603b6/clients/src/main/java/org/apache/kafka/common/requests/ListGroupsRequest.java#L52 LM3: About the client-side buffering proposed to be able to return the renewed records on poll, I wonder if we should be concerned about potential memory issues? I expect that an app would easily end up calling RENEW continuously (common if it's failing to process every record for a period of time?). In that case, how would that buffer be bound? The KIP mentions that it would keep records "until they are re-delivered by the broker", so does it mean that the buffer would continue growing until the client app is able to ACCEPT or REJECT records? LM4: On the new client-side buffer too, did we consider the alternative of no extra buffering on the client, but just have the broker re-send the renewed records to the member that has them? I'm probably missing some broker-side implications of the share fetch, but curious to understand the trade-offs (looking at it from the client side, my first reaction is to avoid adding another buffering on the client if possible, due to the complexities and implications, like memory issues). Thanks! Lianet On Tue, Oct 14, 2025 at 6:14 AM Sushant Mahajan <[email protected]> wrote: > > Hi Andrew, > I have modified the kip to throw IllegalArgumentException on unsupported > ack type. > > Regards, > Sushant Mahajan > > On Mon, 13 Oct 2025, 18:35 Sanskar Jhajharia, > <[email protected]> wrote: > > > Hey Sushant, > > Thanks for the KIP. Overall, the KIP looks like a good enhancement to the > > Share Consumer interface and helps unlock a significant use case. A couple > > of questions or suggestions: > > > > SJ1: If a record is a "poison pill" (a record that consistently causes a > > consumer to require a very long processing timeout), a malicious or buggy > > consumer could be programmed to continuously renew the lock to prevent it > > from ever reaching a failure state on the consumer and being moved to a > > REJECTED or ARCHIVED state in the broker. Unlimited renewals can create a > > self-inflicted, persistent Head-of-Line (HOL) blocking issue for the entire > > share consumer group. Are there broker-side limits (e.g., a maximum number > > of renewals or a maximum cumulative lock duration) that can be configured > > to force a difficult record to eventually transition? > > > > SJ2: This is somehow linked to the comment `SJ1`. Would it make sense to > > improve visibility into lock renewal activity for better debugging and > > operational awareness? For example, exposing broker-level metrics or > > diagnostics that help operators identify cases of frequent or ineffective > > renewals could be valuable in understanding consumer or record-level > > issues. These metrics may be essential for operators to debug long-running > > jobs, identify bottlenecks, and properly configure consumer timeouts. > > > > Regards, > > Sanskar > > > > On Tue, Oct 7, 2025 at 3:46 PM Sushant Mahajan <[email protected]> > > wrote: > > > > > I’d like to start the discussion for KIP-1222: Acquisition lock timeout > > > renewal in share consumer explicit mode. > > > > > > JIRA: https://issues.apache.org/jira/browse/KAFKA-19742 > > > KIP Wiki: > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1222%3A+Acquisition+lock+timeout+renewal+in+share+consumer+explicit+mode > > > > > > Regards, > > > Sushant Mahajan > > > > >
