Hi Andrew,
Thanks for the response.

AM1: Sounds good to me. A final retry with a single record when the limit
is decreased makes sense.

Regards,
Apoorv Mittal


On Tue, Jan 13, 2026 at 10:22 AM Andrew Schofield <[email protected]>
wrote:

> Hi Apoorv,
> Thanks for the response.
>
> AM1: I suggest that we apply the limits at the point which makes most
> sense in the code, and that we avoid caching the limit for particular
> sets of records. There should not be one limit for older records and a
> different one for new records, which is what I think you were suggesting
> with the part about offset 500 in your latest comment.
>
> The delivery count limit is not bounds-checked at the start of a delivery
> attempt because fetching is not concerned with archiving records. It is
> possible that, after the limit has been reduced, a delivery attempt is
> already exceeding the current limit. In this case, we proceed with it as
> the final delivery attempt in terms of throttling. It's a boundary case
> that only occurs when the limit decreases while messages are in flight.
>
> When a delivery attempt completes, we check whether the number of
> deliveries has reached the current limit. If it has, the record is
> archived. If the limit was decreased, it's possible that the limit has
> been exceeded, but that's not a problem and the record is archived
> anyway. If the limit was increased, a record which was nearing its final
> attempt now has more delivery attempts permitted. The throttling that
> applies as the final attempts are made might have been applied
> previously, but can be relaxed to match the new limit.
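>
> To make that concrete, here's a rough Java sketch of the end-of-attempt
> check as I understand it (DeliveryLimitCheck and shouldArchive are
> illustrative names, not the actual broker code):
>
>     // Sketch only, not the actual broker code.
>     final class DeliveryLimitCheck {
>         // Supplier so the *current* dynamic limit is read on each call;
>         // no limit is cached for particular sets of records.
>         private final java.util.function.IntSupplier currentLimit;
>
>         DeliveryLimitCheck(java.util.function.IntSupplier currentLimit) {
>             this.currentLimit = currentLimit;
>         }
>
>         // Called when a delivery attempt completes unacknowledged.
>         boolean shouldArchive(int deliveryCount) {
>             // ">=" rather than "==": if the limit was decreased while
>             // the record was in flight, the count may already exceed
>             // the limit, and the record is archived anyway.
>             return deliveryCount >= currentLimit.getAsInt();
>         }
>     }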
>
> wdyt?
>
> Thanks,
> Andrew
>
> On 2026/01/08 12:51:59 Apoorv Mittal wrote:
> > Thanks for the response.
> >
> > AM1: That can be one approach. But I am also thinking of a scenario
> > where the user toggles to a lower limit and then adjusts back to a
> > higher one, which can get really tricky in the implementation. Do you
> > think that can bring more trouble? Maybe we should only consider the
> > new limit for records fetched after the limit change. For example, if
> > in-flight records span offsets 0-500 and the limit is changed, then the
> > new limit would apply to offsets after 500; that might be simpler and
> > more consistent in implementation. For immediate toggles or updates
> > there might be a range of limits in effect, but I expect that to vanish
> > as soon as the partition advances. What do you think?
> >
> > Regards,
> > Apoorv Mittal
> >
> >
> > On Thu, Jan 8, 2026 at 10:33 AM Andrew Schofield <[email protected]>
> > wrote:
> >
> > > Hi Apoorv,
> > > Thanks for your comment.
> > >
> > > AM1: Batches which have already exceeded the new
> > > `share.delivery.count.limit` should have one final delivery attempt,
> > > I think, since the check against the limit occurs at the end of a
> > > delivery attempt. If that works for you, I will update the KIP.
> > >
> > > Thanks,
> > > Andrew
> > >
> > > On 2026/01/08 10:10:13 Apoorv Mittal wrote:
> > > > Hi Andrew,
> > > > Thanks for the KIP; it will be helpful to manage these
> > > > configurations dynamically. However, I have a follow-up on Chia's
> > > > question:
> > > >
> > > > AM1: Currently, throttling in a share partition adjusts the number
> > > > of records in a share fetch response as per the
> > > > `group.share.delivery.count.limit` config, i.e. as the batch
> > > > delivery count is incremented, the number of records returned in
> > > > the response may be reduced depending on how close the delivery
> > > > count is to `group.share.delivery.count.limit`, finally delivering
> > > > a single record from the batch when the delivery count equals
> > > > `group.share.delivery.count.limit`. So if a user dynamically sets
> > > > `share.delivery.count.limit` to a value lower than the existing
> > > > one, how should the current in-flight batches be treated? Should
> > > > batches which have already exceeded the new
> > > > `share.delivery.count.limit` be rejected right away, or should they
> > > > still be governed by the old limit?
> > > >
> > > > I expect the adjustment of throttled records will also need to
> > > > consider the case where `share.delivery.count.limit` is increased,
> > > > but that is an implementation detail which needs to be carefully
> > > > crafted.
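> > > >
> > > > For illustration, the shape of the current throttling is roughly
> > > > the following (a hypothetical sketch; the actual scaling in the
> > > > broker may differ):
> > > >
> > > >     // Shrink the number of records handed out as the delivery
> > > >     // count approaches the delivery count limit, ending with a
> > > >     // single record on the final permitted attempt.
> > > >     static int throttledRecordCount(int requested, int deliveryCount,
> > > >                                     int deliveryCountLimit) {
> > > >         int attemptsLeft = deliveryCountLimit - deliveryCount;
> > > >         if (attemptsLeft <= 1) {
> > > >             return 1; // final attempt: deliver one record
> > > >         }
> > > >         // Scale down in proportion to the attempts remaining.
> > > >         return Math.max(1, requested * attemptsLeft / deliveryCountLimit);
> > > >     }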
> > > >
> > > > Regards,
> > > > Apoorv Mittal
> > > >
> > > >
> > > > On Wed, Jan 7, 2026 at 5:24 PM Andrew Schofield <
> [email protected]>
> > > > wrote:
> > > >
> > > > > Hi Chia-Ping,
> > > > > Thanks for your comments.
> > > > >
> > > > > chia_00: The group-level configs are all dynamic. This means that
> > > > > when the limits are reduced, they may already be exceeded by
> > > > > active usage. Over time, as records are delivered and locks are
> > > > > released, the system will settle within the new limits.
> > > > >
> > > > > chia_01: This is an interesting question and there is some work
> > > > > off the back of it.
> > > > >
> > > > > For the interval and timeout configs, the broker will fail to
> > > > > start when the group-level config lies outside the min/max
> > > > > specified by the static broker configs. However, the logging when
> > > > > the broker fails to start is unhelpful because it omits the group
> > > > > ID of the offending group. This behaviour is common for consumer
> > > > > groups and share groups. I haven't tried streams groups, but I
> > > > > expect they're the same. This should be improved in terms of
> > > > > logging at the very least so it's clear what needs to be done to
> > > > > get the broker started.
> > > > >
> > > > > For share.record.lock.duration.ms, no such validation occurs as
> > > > > the broker starts. This is an omission. I think we should have
> > > > > the same behaviour for all of the min/max bounds. My view is that
> > > > > failing to start the broker is safest for now.
> > > > >
> > > > > For the new configs in the KIP, the broker should fail to start
> > > > > if the group-level config is outside the bounds of the min/max
> > > > > static broker configs.
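> > > > >
> > > > > As a rough sketch of the kind of startup check I have in mind
> > > > > (validateGroupConfig is a hypothetical name, not the actual
> > > > > broker code):
> > > > >
> > > > >     // Fail broker startup if a persisted group-level config lies
> > > > >     // outside the static broker min/max bounds, naming the
> > > > >     // offending group so the log is actionable.
> > > > >     static void validateGroupConfig(String groupId, String name,
> > > > >                                     long value, long min, long max) {
> > > > >         if (value < min || value > max) {
> > > > >             throw new org.apache.kafka.common.config.ConfigException(
> > > > >                 String.format("Group %s: %s=%d is outside the bounds [%d, %d]",
> > > > >                     groupId, name, value, min, max));
> > > > >         }
> > > > >     }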
> > > > >
> > > > > wdyt? I'll make a KIP update when I think we have consensus.
> > > > >
> > > > > Thanks,
> > > > > Andrew
> > > > >
> > > > > On 2026/01/05 13:56:16 Chia-Ping Tsai wrote:
> > > > > > Hi Andrew,
> > > > > >
> > > > > > Thanks for the KIP. I have a few questions regarding the
> > > > > > configuration behaviour:
> > > > > >
> > > > > > chia_00: Dynamic Update Behaviour
> > > > > > Are these new group-level configurations dynamic? Specifically,
> > > > > > if we alter share.delivery.count.limit or
> > > > > > share.partition.max.record.locks at runtime, will the changes
> > > > > > take effect immediately for active share groups?
> > > > > >
> > > > > > chia_01: Configuration Validation on Broker Restart
> > > > > > How does the broker handle existing group configurations that
> > > > > > fall out of bounds after a broker restart? For example, suppose
> > > > > > a group has share.partition.max.record.locks set to 100 (which
> > > > > > was valid at the time). If the broker is later restarted with a
> > > > > > stricter limit of group.share.max.partition.max.record.locks =
> > > > > > 50, how will this conflict be handled when the group is loaded?
> > > > > >
> > > > > > Best,
> > > > > > Chia-Ping
> > > > > >
> > > > > > On 2025/11/24 21:15:48 Andrew Schofield wrote:
> > > > > > > Hi,
> > > > > > > I’d like to start the discussion on a small KIP which adds some
> > > > > configurations for share groups which were previously only
> available as
> > > > > broker configurations.
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1240%3A+Additional+group+configurations+for+share+groups
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Andrew
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
