Hi Chirag,
The KIP looks good from my point of view now.

Thanks,
Andrew

> On 17 Oct 2025, at 08:40, CHIRAG WADHWA <[email protected]> wrote:
>
> Hi Apoorv,
> Thanks for the suggestions. Please find my responses below:
>
> AM1: That was a small oversight on my part. Yes, for regular consumers, lag
> is always calculated using the read_uncommitted isolation level. I’ve
> updated the KIP to specify that the share partition lag calculation will
> also rely on the Log End Offset (LEO), obtained using the read_uncommitted
> isolation level, to determine the upper bound.
>
> AM2: All offsets that lie before the Share Partition Start Offset (SPSO)
> are considered processed by the share consumers, whereas all offsets beyond
> the Share Partition End Offset (SPEO) are treated as candidates for future
> processing, regardless of whether they correspond to control records,
> compacted records, or regular records. However, within the range between
> SPSO and SPEO, there may be cases where certain control or compacted
> records have already been identified and excluded from the share partition
> lag, while others are yet to be recognized and still contribute to it. This
> nuanced handling of offsets is what differentiates share consumption from
> regular consumption, which is why I wanted to highlight it.
>
> AM3: I’ve updated the KIP to move this part under the Motivation section.
>
> Regards,
> Chirag Wadhwa
>
> On Thu, 16 Oct 2025 at 21:58, Apoorv Mittal <[email protected]> wrote:
>
>> Hi Chirag,
>> Thanks for the KIP, this is a very helpful feature to have for the share
>> groups. Some comments on the KIP:
>>
>> AM1: Though having read_committed and read_uncommitted isolation levels
>> while determining the highest offset makes complete sense, I was
>> wondering whether the lag might change if the group isolation level is
>> switched, which might add confusion for customers.
>> Also, consumer groups compute lag using read_uncommitted itself, so maybe
>> we just keep parity with consumer groups and keep it simple for share
>> groups as well, by considering read_uncommitted itself. Wdyt?
>>
>> AM2: I am not sure what we mean by the following text: "However, offsets
>> within the in-flight boundary (between SPSO and SPEO) require additional
>> handling so that the lag more accurately reflects the number of records to
>> be processed." Can you please clarify?
>>
>> AM3: Not sure if this text belongs in the Persistence section or should go
>> in Motivation: "Looking ahead, the plan is to implement an assignor that
>> allocates members to partitions based on partition-level backlogs."
>>
>> Regards,
>> Apoorv Mittal
>>
>> On Wed, Oct 15, 2025 at 12:23 PM Chirag Wadhwa <[email protected]> wrote:
>>
>>> Hi Andrew,
>>> Thanks for the suggestions.
>>>
>>> Regarding the first point, the KIP has been updated to include a new
>>> subsection that talks about control records, as well as the compacted
>>> records.
>>> Regarding the second point, I personally resonate more with
>>> DeliveryCompleteCount. The schemas in the KIP have also been updated
>>> accordingly.
>>>
>>> Thanks,
>>> Chirag
>>>
>>> On Tue, Oct 14, 2025 at 6:45 PM Andrew Schofield <[email protected]> wrote:
>>>
>>>> Hi Chirag,
>>>> Thanks for the KIP. I have a few comments.
>>>>
>>>> AS1: The calculation of the lag needs to take into account offsets which
>>>> are not occupied by records, such as when they’ve been removed due to
>>>> compaction. Also, the offsets which correspond to control records need
>>>> to be taken into account. Please update the text to make this clear.
>>>>
>>>> AS2: The name InFlightTerminalRecords in the schemas seems a bit strange
>>>> to me.
>>>> What you are doing is calculating the offsets after the SPSO for which
>>>> delivery is complete, either because the records are acknowledged or
>>>> archived, or because they are control records, or because the offsets do
>>>> not correspond to records at all. Personally, I only think of the
>>>> in-flight records as being those between the SPSO and the SPEO which
>>>> have one of the delivery states, not those which never did.
>>>>
>>>> I’ve been very careful to exclude the SPEO from the external interfaces,
>>>> because one day I expect to change the code so that the in-flight
>>>> records are sparse and the distance between the SPSO and the SPEO can be
>>>> much greater. The concept of lag in this KIP needs to be flexible enough
>>>> to accommodate this.
>>>>
>>>> I wonder whether a name like DeliveryCompleteCount or
>>>> DeliveryCompleteRecords instead of InFlightTerminalRecords would be
>>>> better. This is the number of offsets after the SPSO for which the
>>>> records have completed delivery, either because they’re in a terminal
>>>> state, or because no delivery is required. Wdyt?
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>>> On 9 Oct 2025, at 14:43, CHIRAG WADHWA <[email protected]> wrote:
>>>>>
>>>>> I'd like to start the discussion for KIP-1226: Introducing Share
>>>>> Partition Lag Persistence and Retrieval.
>>>>>
>>>>> KIP Wiki:
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1226:+Introducing+Share+Partition+Lag+Persistence+and+Retrieval
>>>>>
>>>>> Regards,
>>>>> Chirag Wadhwa
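
The lag arithmetic discussed in this thread can be sketched roughly as below. This is only an illustration, not the KIP's implementation: the formula (LEO minus SPSO minus DeliveryCompleteCount) is an assumption inferred from the replies above, and the function and parameter names are hypothetical.

```python
def share_partition_lag(log_end_offset: int, spso: int,
                        delivery_complete_count: int) -> int:
    """Estimate how many offsets in a share partition still await delivery.

    Assumed formula (inferred from the thread, not quoted from the KIP):
        lag = (LEO - SPSO) - DeliveryCompleteCount

    log_end_offset:          LEO, read with the read_uncommitted isolation level
    spso:                    Share Partition Start Offset; everything before it
                             is considered processed
    delivery_complete_count: offsets at or after the SPSO whose delivery is
                             already complete (acknowledged, archived, control
                             records, or offsets with no record, e.g. compacted)
    """
    if log_end_offset < spso:
        raise ValueError("LEO cannot be smaller than the SPSO")

    # Upper bound: every offset from the SPSO up to (not including) the LEO.
    offsets_from_spso = log_end_offset - spso

    if not 0 <= delivery_complete_count <= offsets_from_spso:
        raise ValueError("DeliveryCompleteCount is outside the SPSO..LEO range")

    # Offsets whose delivery is already complete do not contribute to the lag.
    return offsets_from_spso - delivery_complete_count


# Example: LEO = 120, SPSO = 100, and 5 offsets after the SPSO are complete.
print(share_partition_lag(120, 100, 5))  # 15
```

Note that the sketch never references the SPEO, in line with Andrew's point about keeping the SPEO out of the external interfaces.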
