Hi Chirag,
Thanks for the KIP. I have a few comments.

AS1: The calculation of the lag needs to take into account offsets which are not occupied by records, such as when they’ve been removed due to compaction. Also, the offsets which correspond to control records need to be taken into account. Please update the text to make this clear.
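
As a rough illustration of AS1, the sketch below compares a naive offset difference with a count that skips offsets needing no delivery. Every name here (ShareLagSketch, lag, noDeliveryNeeded) is hypothetical and not taken from the KIP, and the half-open range [startOffset, endOffset) is an assumption made only for the example.

```java
import java.util.Set;

public class ShareLagSketch {

    // Hypothetical helper, not from the KIP: counts the offsets in
    // [startOffset, endOffset) that still correspond to deliverable records.
    // noDeliveryNeeded holds offsets that are gaps (for example, removed by
    // compaction) or that hold control records.
    static long lag(long startOffset, long endOffset, Set<Long> noDeliveryNeeded) {
        long lag = 0;
        for (long offset = startOffset; offset < endOffset; offset++) {
            if (!noDeliveryNeeded.contains(offset)) {
                lag++;
            }
        }
        return lag;
    }

    public static void main(String[] args) {
        // Assumed example: offsets 10..19, where 12 and 13 were compacted
        // away and 17 is a control record.
        Set<Long> noDeliveryNeeded = Set.of(12L, 13L, 17L);

        long naive = 20 - 10;                         // plain offset difference
        long adjusted = lag(10, 20, noDeliveryNeeded); // skips gaps and control records

        System.out.println("naive=" + naive + " adjusted=" + adjusted);
    }
}
```

With these assumed inputs the plain offset difference is 10, while only 7 offsets hold records that still need delivery, which is the kind of gap the KIP text should account for.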

AS2: The name InFlightTerminalRecords in the schemas seems a bit strange to me. What you are doing is calculating the offsets after the SPSO for which delivery is complete, either because the records are acknowledged or archived, or because they are control records, or because the offsets do not correspond to records at all. Personally, I only think of the in-flight records as being those between the SPSO and the SPEO which have one of the delivery states, not those which never did. I’ve been very careful to exclude the SPEO from the external interfaces, because one day I expect to change the code so that the in-flight records are sparse and the distance between the SPSO and the SPEO can be much greater. The concept of lag in this KIP needs to be flexible enough to accommodate this. I wonder whether a name like DeliveryCompleteCount or DeliveryCompleteRecords instead of InFlightTerminalRecords would be better. This is the number of offsets after the SPSO for which the records have completed delivery, either because they’re in a terminal state, or because no delivery is required. Wdyt?

Thanks,
Andrew

> On 9 Oct 2025, at 14:43, CHIRAG WADHWA <[email protected]> wrote:
>
> I'd like to start the discussion for KIP-1226: Introducing Share Partition
> Lag Persistence and Retrieval.
>
> KIP Wiki:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1226:+Introducing+Share+Partition+Lag+Persistence+and+Retrieval
>
> Regards,
> Chirag Wadhwa
