Hi Andrew,
Thanks for the response.

AM1: Sounds good to me. A final retry with a single record when the limit
is decreased makes sense.
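Just to confirm we mean the same thing, here is how I picture the
completion-time check. This is only a sketch with made-up names
(DeliveryLimitCheck, GroupConfig, shouldArchive), not the actual
SharePartition code:

// Hypothetical names only -- not the real broker code.
class DeliveryLimitCheck {
    // Stands in for the dynamic group config lookup.
    interface GroupConfig {
        int shareDeliveryCountLimit();
    }

    // Called as a delivery attempt completes; the limit is read fresh
    // each time rather than cached per record or per batch.
    static boolean shouldArchive(int deliveryCount, GroupConfig config) {
        // ">=" also covers a lowered limit: a record whose count already
        // exceeds the new, smaller limit is archived after this final
        // attempt.
        return deliveryCount >= config.shareDeliveryCountLimit();
    }
}

An increased limit falls out of the same check: the fresh read simply
permits more attempts.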
Regards,
Apoorv Mittal

On Tue, Jan 13, 2026 at 10:22 AM Andrew Schofield <[email protected]> wrote:
> Hi Apoorv,
> Thanks for the response.
>
> AM1: I suggest that we apply the limits at the time which makes most
> sense in the code, and that we avoid caching the limit for particular
> sets of records. There should not be one limit for older records and a
> different one for new records, which is what I think you were suggesting
> with the part about offset 500 in your latest comment.
>
> The delivery count limit is not bounds-checked at the start of the
> delivery attempt because fetching is not concerned with archiving
> records. It is possible that a reduced limit means that a delivery
> attempt might be exceeding the current limit. In this case, we proceed
> with this as the final delivery attempt in terms of throttling. It's a
> boundary case that only occurs when the limit decreases and messages
> are already in-flight.
>
> When a delivery attempt completes, we check whether the number of
> deliveries has reached the current limit. If it has, the record will be
> archived. If the limit was decreased, it's possible that the limit has
> been exceeded, but that's not a problem and the record is archived
> anyway. If the limit was increased, it's possible that a record was
> nearing its final attempt but now it has more delivery attempts
> permitted. The throttling that occurs as the final attempts are made
> might have been applied previously, but can be relaxed to match the new
> limit.
>
> wdyt?
>
> Thanks,
> Andrew
>
> On 2026/01/08 12:51:59 Apoorv Mittal wrote:
> > Thanks for the response.
> >
> > AM1: That can be one approach. But I am also thinking of a scenario
> > where the user toggles to a lower limit and then adjusts back to a
> > higher one, which can get really tricky in the implementation. Do you
> > think that can bring more trouble? Maybe we should only consider the
> > new limit for records fetched after the limit change. For example, if
> > in-flight records are at offsets 0-500 and the limit is changed, then
> > the new limit should apply only to offsets after 500; that might be
> > simpler and more consistent in implementation. For immediate toggles
> > or updates there might be a range of limits, but I expect that to
> > vanish as soon as the partition advances. What do you think?
> >
> > Regards,
> > Apoorv Mittal
> > +44 7721681581
> >
> >
> > On Thu, Jan 8, 2026 at 10:33 AM Andrew Schofield <[email protected]>
> > wrote:
> > > Hi Apoorv,
> > > Thanks for your comment.
> > >
> > > AM1: Batches which have already exceeded the new
> > > `share.delivery.count.limit` should have one final delivery attempt,
> > > I think, since the check against the limit occurs at the end of a
> > > delivery attempt. If that works for you, I will update the KIP.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > On 2026/01/08 10:10:13 Apoorv Mittal wrote:
> > > > Hi Andrew,
> > > > Thanks for the KIP; it will be helpful to manage these
> > > > configurations dynamically. However, I have a follow-up on Chia's
> > > > question:
> > > >
> > > > AM1: Currently, throttling in the share partition adjusts the
> > > > number of records in a share fetch request according to the
> > > > `group.share.delivery.count.limit` config, i.e. as the batch
> > > > delivery count is incremented, the number of records returned in
> > > > the response might get reduced depending on how close the delivery
> > > > count is to `group.share.delivery.count.limit`, finally delivering
> > > > a single record from the batch when the delivery count is the same
> > > > as `group.share.delivery.count.limit`. So if a user dynamically
> > > > sets `share.delivery.count.limit` to a lower value than the
> > > > existing one, how should the current in-flight batches be treated,
> > > > i.e. should the batches which have already exceeded the new
> > > > `share.delivery.count.limit` be rejected right away, or should
> > > > they still consider the old limit?
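> > > >
> > > > To illustrate the adjustment (the names and the scaling formula
> > > > here are purely illustrative, not the actual SharePartition
> > > > logic):
> > > >
> > > > // Illustrative sketch only; the real scaling may differ.
> > > > static int recordsToAcquire(int batchSize, int deliveryCount,
> > > >                             int limit) {
> > > >     if (deliveryCount >= limit) {
> > > >         return 1; // final attempt: deliver a single record
> > > >     }
> > > >     // Shrink the acquired batch as the count nears the limit.
> > > >     int remainingAttempts = limit - deliveryCount;
> > > >     return Math.max(1, batchSize * remainingAttempts / limit);
> > > > }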
> > > >
> > > > I am expecting the adjustment in throttled records to also
> > > > carefully consider when `share.delivery.count.limit` is increased,
> > > > but that will be an implementation detail which needs to be
> > > > carefully crafted.
> > > >
> > > > Regards,
> > > > Apoorv Mittal
> > > >
> > > >
> > > > On Wed, Jan 7, 2026 at 5:24 PM Andrew Schofield <[email protected]>
> > > > wrote:
> > > > > Hi Chia-Ping,
> > > > > Thanks for your comments.
> > > > >
> > > > > chia_00: The group-level configs are all dynamic. This means
> > > > > that when the limits are reduced, they may already be exceeded
> > > > > by active usage. Over time, as records are delivered and locks
> > > > > are released, the system will settle within the new limits.
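> > > > >
> > > > > For example, a runtime alteration looks roughly like this with
> > > > > the admin client. This is a sketch: it assumes the GROUP config
> > > > > resource type used for group configs, and the bootstrap server
> > > > > and group name are illustrative:
> > > > >
> > > > > import org.apache.kafka.clients.admin.*;
> > > > > import org.apache.kafka.common.config.ConfigResource;
> > > > > import java.util.Collection;
> > > > > import java.util.List;
> > > > > import java.util.Map;
> > > > > import java.util.Properties;
> > > > >
> > > > > // Inside a method declared "throws Exception":
> > > > > Properties props = new Properties();
> > > > > props.put("bootstrap.servers", "localhost:9092");
> > > > > try (Admin admin = Admin.create(props)) {
> > > > >     ConfigResource group = new ConfigResource(
> > > > >         ConfigResource.Type.GROUP, "my-share-group");
> > > > >     AlterConfigOp op = new AlterConfigOp(
> > > > >         new ConfigEntry("share.delivery.count.limit", "3"),
> > > > >         AlterConfigOp.OpType.SET);
> > > > >     Map<ConfigResource, Collection<AlterConfigOp>> ops =
> > > > >         Map.of(group, List.of(op));
> > > > >     admin.incrementalAlterConfigs(ops).all().get();
> > > > > }
> > > > >
> > > > > No broker restart is needed for the change to take effect.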
> > > > >
> > > > > chia_01: This is an interesting question and there is some work
> > > > > off the back of it.
> > > > >
> > > > > For the interval and timeout configs, the broker will fail to
> > > > > start when the group-level config lies outside the min/max
> > > > > specified by the static broker configs. However, the logging
> > > > > when the broker fails to start is unhelpful because it omits
> > > > > the group ID of the offending group. This behaviour is common
> > > > > to consumer groups and share groups. I haven't tried streams
> > > > > groups, but I expect they're the same. This should be improved,
> > > > > in terms of logging at the very least, so it's clear what needs
> > > > > to be done to get the broker started.
> > > > >
> > > > > For share.record.lock.duration.ms, no such validation occurs as
> > > > > the broker starts. This is an omission. We should have the same
> > > > > behaviour for all of the min/max bounds, I think. My view is
> > > > > that failing to start the broker is safest for now.
> > > > >
> > > > > For the new configs in the KIP, the broker should fail to start
> > > > > if the group-level config is outside the bounds of the min/max
> > > > > static broker configs.
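> > > > >
> > > > > Roughly, the validation I have in mind is the following sketch
> > > > > (the method name and message format are illustrative); the
> > > > > point is to include the group ID so the operator knows what to
> > > > > fix:
> > > > >
> > > > > // Sketch only; validateGroupConfig is not an existing method.
> > > > > static void validateGroupConfig(String groupId, String name,
> > > > >                                 long value, long min, long max) {
> > > > >     if (value < min || value > max) {
> > > > >         // Fail broker startup with an actionable message that
> > > > >         // names the offending group.
> > > > >         throw new org.apache.kafka.common.config.ConfigException(
> > > > >             "Group " + groupId + ": " + name + "=" + value +
> > > > >             " is outside the broker bounds [" + min + ", " +
> > > > >             max + "]");
> > > > >     }
> > > > > }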
> > > > >
> > > > > wdyt? I'll make a KIP update when I think we have consensus.
> > > > >
> > > > > Thanks,
> > > > > Andrew
> > > > >
> > > > > On 2026/01/05 13:56:16 Chia-Ping Tsai wrote:
> > > > > > hi Andrew
> > > > > >
> > > > > > Thanks for the KIP. I have a few questions regarding the
> > > > > > configuration behaviour:
> > > > > >
> > > > > > chia_00: Dynamic Update Behavior
> > > > > > Are these new group-level configurations dynamic?
> > > > > > Specifically, if we alter share.delivery.count.limit or
> > > > > > share.partition.max.record.locks at runtime, will the changes
> > > > > > take effect immediately for active share groups?
> > > > > >
> > > > > > chia_01: Configuration Validation on Broker Restart
> > > > > > How does the broker handle existing group configurations that
> > > > > > fall out of bounds after a broker restart? For example,
> > > > > > suppose a group has share.partition.max.record.locks set to
> > > > > > 100 (which was valid at the time). If the broker is later
> > > > > > restarted with a stricter limit of
> > > > > > group.share.max.partition.max.record.locks = 50, how will the
> > > > > > loaded group handle this conflict?
> > > > > >
> > > > > > Best,
> > > > > > Chia-Ping
> > > > > >
> > > > > > On 2025/11/24 21:15:48 Andrew Schofield wrote:
> > > > > > > Hi,
> > > > > > > I’d like to start the discussion on a small KIP which adds
> > > > > > > some configurations for share groups which were previously
> > > > > > > only available as broker configurations.
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1240%3A+Additional+group+configurations+for+share+groups
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Andrew