Hi Sanskar, Thanks for the feedback. I have added and additional section to proposed changes to highlight the issue and some debugging actions.
Regards, Sushant Mahajan On Wed, 15 Oct 2025, 01:53 Lianet M., <[email protected]> wrote: > Hi Sushant, thanks for the KIP. > > LM1: About the validation for brokers that may not support this > feature, the KIP proposes to validate on the "acknowledge" call, but > that does not build any request or have any relationship with the > NetworkClient (where the ApiVersions info lives), so how exactly are > we imagining we would have the node version to check on acknowledge? > It's usually when we build the request that we have the info handy, to > check if the Node version is not what the client needs for a feature > it is using. So here it would mean that the error would be discovered > on the call to poll or commit following the acknowledge I expect. I do > agree that it makes more sense to throw it on ack, but just pointing > out the challenge to make sure we thought about it. > > LM2: About the IllegalArgumentException when using an ack type not > supported by the broker, I wonder if it should be > UnsupportedVersionException? I think this is what we consistently > throw when the client attempts to use a feature not supported by the > broker version. It's actually a common pattern on the client (related > to both comments LM1 and LM2): check node version when building the > request and throwing UnsupportedVersionException depending on a > feature being used: ex. > > https://github.com/apache/kafka/blob/4cb5f1ca124b810b32fcef1ca23f00faf70603b6/clients/src/main/java/org/apache/kafka/common/requests/ListGroupsRequest.java#L52 > > LM3: About the client-side buffering proposed to be able to return the > renewed records on poll, I wonder if we should be concerned about > potential memory issues? I expect that an app would easily end up > calling RENEW continuously (common if it's failing to process every > record for a period of time?). In that case, how would that buffer be > bound? The KIP mentions that it would keep records "until they are > re-delivered by the broker", so does it mean that the buffer would > continue growing until the client app is able to ACCEPT or REJECT > records? > > LM4: On the new client-side buffer too, did we consider the > alternative of no extra buffering on the client, but just have the > broker re-send the renewed records to the member that has them? I'm > probably missing some broker-side implications of the share fetch, but > curious to understand the trade-offs (looking at it from the client > side, my first reaction is to avoid adding another buffering on the > client if possible, due to the complexities and implications, like > memory issues). > > Thanks! > Lianet > > On Tue, Oct 14, 2025 at 6:14 AM Sushant Mahajan <[email protected]> > wrote: > > > > Hi Andrew, > > I have modified the kip to throw IllegalArgumentException on unsupported > > ack type. > > > > Regards, > > Sushant Mahajan > > > > On Mon, 13 Oct 2025, 18:35 Sanskar Jhajharia, > > <[email protected]> wrote: > > > > > Hey Sushant, > > > Thanks for the KIP. Overall, the KIP looks like a good enhancement to > the > > > Share Consumer interface and helps unlock a significant use case. A > couple > > > of questions or suggestions: > > > > > > SJ1: If a record is a "poison pill" (a record that consistently causes > a > > > consumer to require a very long processing timeout), a malicious or > buggy > > > consumer could be programmed to continuously renew the lock to prevent > it > > > from ever reaching a failure state on the consumer and being moved to a > > > REJECTED or ARCHIVED state in the broker. Unlimited renewals can > create a > > > self-inflicted, persistent Head-of-Line (HOL) blocking issue for the > entire > > > share consumer group. Are there broker-side limits (e.g., a maximum > number > > > of renewals or a maximum cumulative lock duration) that can be > configured > > > to force a difficult record to eventually transition? > > > > > > SJ2: This is somehow linked to the comment `SJ1`. Would it make sense > to > > > improve visibility into lock renewal activity for better debugging and > > > operational awareness? For example, exposing broker-level metrics or > > > diagnostics that help operators identify cases of frequent or > ineffective > > > renewals could be valuable in understanding consumer or record-level > > > issues. These metrics may be essential for operators to debug > long-running > > > jobs, identify bottlenecks, and properly configure consumer timeouts. > > > > > > Regards, > > > Sanskar > > > > > > On Tue, Oct 7, 2025 at 3:46 PM Sushant Mahajan <[email protected]> > > > wrote: > > > > > > > I’d like to start the discussion for KIP-1222: Acquisition lock > timeout > > > > renewal in share consumer explicit mode. > > > > > > > > JIRA: https://issues.apache.org/jira/browse/KAFKA-19742 > > > > KIP Wiki: > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1222%3A+Acquisition+lock+timeout+renewal+in+share+consumer+explicit+mode > > > > > > > > Regards, > > > > Sushant Mahajan > > > > > > > >
