Hi Jiunn-Yang, I've just been re-reading this KIP to ensure that share group support is not needed. I've discovered that we made a mistake implementing KIP-932 in this area, so we'll rectify that and check whether it's really tight enough not to require KIP-1327.
One trivial comment for consistency.

AS1: In kafka-consumer-groups.sh, unknown column data is represented by "-", not "N/A". In kafka-topics.sh, we do use "N/A".

Thanks,
Andrew

On 2026/06/23 23:23:38 黃竣陽 wrote: > Hi all, > > Manually bumping this thread. > > Best Regards, > Jiunn-Yang > > > 黃竣陽 <[email protected]> wrote on June 17, 2026, at 9:17 PM: > > > > Hello chia, > > > > Thanks for the feedback, I have updated the KIP. > > > > Best Regards, > > Jiunn-Yang > > > >> Chia-Ping Tsai <[email protected]> wrote on June 17, 2026, at 12:47 AM: > >> > >> hi Jiunn-Yang > >> > >>> When the config is set on a cluster that has not yet been upgraded... > >>> classification cannot occur... the consumer falls back to the base > >>> auto.offset.reset for the affected partitions. No exception is thrown, > >>> and no operational disruption results. > >> > >> Existing groups can't take advantage of this excellent new configuration. > >> Allowing users to modify the group creation time might be overkill. > >> Instead, we could print a useful warning message to guide users. For > >> example, we can suggest that they re-create the group with their existing > >> committed offsets > >> > >>> Protocol changes > >> > >> Would you mind listing those RPC changes in a table format? > >> > >>> The full interaction matrix between the base policy and the new-partition > >>> policy is: > >> > >> Please add a field to describe the target scenario when using these > >> policies > >> > >> Best, > >> Chia-Ping > >> > >> > >> On 2026/06/16 16:14:49 黃竣陽 wrote: > >>> Hello Jun, chia, > >>> > >>> Thanks for the feedback, I have updated the KIP for the new > >>> approach, PTAL > >>> > >>> Best Regards, > >>> Jiunn-Yang > >>> > >>>> Chia-Ping Tsai <[email protected]> wrote on June 16, 2026, at 8:23 AM: > >>>> > >>>> hi Jun > >>>> > >>>> Yes, your approach is great. 
I think the combination of latest (for > >>>> existing partitions) and by_duration (for new partitions) can address > >>>> 99% of the complaints I have heard regarding this issue. > >>>> > >>>> Also, leveraging the group creation time here opens the door to > >>>> implementing a new policy based on timestamp seek in the future, should > >>>> the community want to pursue that. > >>>> > >>>> Thanks for your patience and constructive feedback. We will update the > >>>> KIP accordingly. > >>>> > >>>> Best, Chia-Ping > >>>> > >>>>> Jun Rao via dev <[email protected]> wrote on June 16, 2026, at 5:11 AM: > >>>>> > >>>>> Hi, Chia-Ping, > >>>>> > >>>>> Thanks for the reply. > >>>>> > >>>>> I agree that it's probably useful to allow a user to configure a > >>>>> different > >>>>> offset policy for existing partitions vs new partitions. However, using > >>>>> group creation time to capture that seems more intuitive. Here is > >>>>> another > >>>>> proposal: remove auto.offset.reset.max.age.ms and categorize new > >>>>> partitions > >>>>> based on group creation time. Introduce > >>>>> a new config auto.offset.reset.new.partitions whose values can be > >>>>> earliest, > >>>>> latest and by_duration, the same as auto.offset.reset. Users can set > >>>>> `auto.offset.reset.new.partitions` to `earliest` if they want to > >>>>> guarantee > >>>>> no data loss on new partitions. They can also use by_duration to set an > >>>>> upper bound on the backlog replayed, which can be different from that of > >>>>> the existing partitions. This will address your concern about too much > >>>>> backlog being replayed when the offsets are lost. What do you think? 
> >>>>> > >>>>> Jun > >>>>> > >>>>> > >>>>>> On Mon, Jun 15, 2026 at 10:39 AM Chia-Ping Tsai <[email protected]> > >>>>>> wrote: > >>>>>> > >>>>>> hi Jun > >>>>>> > >>>>>> The most important part of this story is how users should expect the > >>>>>> data > >>>>>> they can see when using the latest or by_duration policy with expanded > >>>>>> partitions. > >>>>>> > >>>>>> Yes, the by_duration policy can minimize data loss, but it is > >>>>>> non-deterministic, which means users will either read too many > >>>>>> historical > >>>>>> records from existing partitions or lose some records from expanded > >>>>>> partitions. > >>>>>> > >>>>>> Also, I agree that auto.offset.reset.max.age.ms > >>>>>> is a bit hard to understand, and that is why I preferred having a > >>>>>> whole new > >>>>>> policy based entirely on group creation time (KIP-1282) > >>>>>> > >>>>>> Best, > >>>>>> Chia-Ping > >>>>>> > >>>>>> Jun Rao via dev <[email protected]> wrote on Tue, June 16, 2026, at 1:08 AM: > >>>>>> > >>>>>>> Hi, Chia-Ping and Jiunn-Yang, > >>>>>>> > >>>>>>> Thanks for the reply. I am still trying to understand the value of > >>>>>>> the new > >>>>>>> configs with the KIP. > >>>>>>> > >>>>>>> The motivation of the KIP is that a user doesn't want to miss the > >>>>>>> data if > >>>>>>> the backlog is small. The backlog of the existing partition is easy to > >>>>>>> understand because it relates to retention time. The backlog for the > >>>>>>> new > >>>>>>> partition is a bit subtle to understand since it depends on the > >>>>>>> metadata > >>>>>>> refresh delay. 
To set auto.offset.reset.max.age.ms, > >>>>>>> the user needs to > >>>>>>> understand the metadata refresh delay on the consumer side and use it > >>>>>>> to > >>>>>>> set the config. > >>>>>>> > >>>>>>> Now, let's consider the alternative: setting the same value for the > >>>>>>> existing by_duration policy. The KIP lists three issues with this > >>>>>>> approach. > >>>>>>> 1. It computes the seek target client-side as now() - duration, which > >>>>>>> introduces clock skew across consumers and forces operators to choose > >>>>>>> overly large durations, causing unnecessary reprocessing. > >>>>>>> 2. The target timestamp is recomputed on each retry, so failed > >>>>>>> ListOffsetsRequest retries can shift the target forward and > >>>>>>> potentially > >>>>>>> miss records produced between attempts. > >>>>>>> 3. It applies uniformly to all partitions without committed offsets, > >>>>>>> and > >>>>>>> cannot distinguish newly expanded partitions from long-existing > >>>>>>> partitions > >>>>>>> newly assigned to the group, leading to unnecessary replay. > >>>>>>> > >>>>>>> Issues 1 and 2 are uncommon and can be mitigated by adding a bit of > >>>>>>> buffer to > >>>>>>> the metadata refresh delay. We could also consider improving the > >>>>>>> implementation. For issue 3, the metadata refresh delay is typically > >>>>>>> low > >>>>>>> (on the order of minutes with the classic consumer and tens of seconds > >>>>>>> with > >>>>>>> the new consumer). If a user is ok with reading that much backlog for > >>>>>>> new > >>>>>>> partitions, it seems they will be ok doing the same for existing > >>>>>>> partitions. > >>>>>>> > >>>>>>> So, instead of introducing a new config, could we just reuse the > >>>>>>> existing > >>>>>>> config with better documentation and/or implementation? 
> >>>>>>> > >>>>>>> Jun > >>>>>>> > >>>>>>> > >>>>>>>> On Sat, Jun 13, 2026 at 12:19 AM 黃竣陽 <[email protected]> wrote: > >>>>>>> > >>>>>>>> Hello Jun, > >>>>>>>> > >>>>>>>> You're right that group creation time is the more intuitive answer at > >>>>>>>> first glance, > >>>>>>>> the KIP's own motivation talks about partitions that "predate the > >>>>>>>> group" > >>>>>>>> vs partitions > >>>>>>>> "created during group runtime," which directly points to a > >>>>>>> group-lifecycle > >>>>>>>> classifier. > >>>>>>>> I'd like to walk through why we landed on partition age, and the > >>>>>>>> trade-offs we considered. > >>>>>>>> > >>>>>>>> We evaluated three candidate signals: > >>>>>>>> > >>>>>>>> 1. `by_duration:5secs` > >>>>>>>> > >>>>>>>> This covers the metadata blindness window, but has issues the KIP > >>>>>>>> currently documents > >>>>>>>> under "Why not use `by_duration`?": > >>>>>>>> > >>>>>>>> - Client-side `now() - duration` introduces clock skew across > >>>>>>>> consumers. > >>>>>>>> - `ListOffsets` retries shift the target forward, potentially missing > >>>>>>>> records produced between > >>>>>>>> attempts. > >>>>>>>> - It applies uniformly to all partitions without committed offsets, > >>>>>>>> including pre-existing partitions > >>>>>>>> newly assigned to the group, causing unnecessary replay. > >>>>>>>> > >>>>>>>> 2. Group creation time as classifier > >>>>>>>> > >>>>>>>> This works cleanly when the consumer is actively running. Our concern > >>>>>>>> is the idle / late-rejoin case: > >>>>>>>> > >>>>>>>> T=0: Group created. > >>>>>>>> T=1..T=100: Consumer idle (down, disconnected, etc.). > >>>>>>>> T=50: Partition added during the idle window. > >>>>>>>> T=100: Consumer resumes. > >>>>>>>> > >>>>>>>> Under group creation time, the new partition is classified as new > >>>>>>>> (`50 > 0`) and reset to `earliest`, replaying everything from T=50. 
> >>>>>>>> But during `[T=1, T=100]`, base partitions also accumulated data that > >>>>>>>> the consumer accepts as lost — that is precisely the contract of > >>>>>>>> `auto.offset.reset=latest`. There is no principled reason to treat > >>>>>>>> the new partition differently; both contain backlog accumulated > >>>>>>>> during > >>>>>>>> the same idle window. > >>>>>>>> > >>>>>>>> This aligns with the "backlog is backlog” principle you raised in > >>>>>>>> the KIP-1282 thread: a `latest` user has tolerated some backlog on > >>>>>>>> every other partition during the same idle period; forcing 0-backlog > >>>>>>>> tolerance only on new partitions would be inconsistent with that > >>>>>>>> tolerance. > >>>>>>>> > >>>>>>>> 3. Partition age vs threshold > >>>>>>>> > >>>>>>>> Partition age corresponds to the actual silent data loss window, > >>>>>>>> the gap between partition creation and the consumer’s metadata > >>>>>>>> refresh. Within this window, data loss is genuinely silent: the > >>>>>>>> consumer had no opportunity to know about the partition. Outside this > >>>>>>>> window, missing data reflects either: > >>>>>>>> > >>>>>>>> - (a) the user’s tolerated cost of running with idle consumers, or > >>>>>>>> - (b) an operational issue to surface via monitoring, not via reset > >>>>>>> policy. > >>>>>>>> > >>>>>>>> We did not choose partition age because it is more elegant than group > >>>>>>>> creation time — we chose it because its failure mode (requires a > >>>>>>>> threshold) is > >>>>>>>> less invasive than the failure mode of group creation time (overrides > >>>>>>>> user-stated > >>>>>>>> `latest` intent during idle periods). > >>>>>>>> > >>>>>>>> Best Regards, > >>>>>>>> Jiunn-Yang > >>>>>>>> > >>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年6月13日 上午11:52 寫道: > >>>>>>>>> > >>>>>>>>> Hi Jun, > >>>>>>>>> > >>>>>>>>> Relying on both creation times will create an inconsistent > >>>>>>>>> scenario. 
A > >>>>>>>>> consumer that lost all offsets due to a long sleep will seek to the > >>>>>>>>> beginning for the partitions created later than the group. > >>>>>>>>> > >>>>>>>>> That is why we initially proposed KIP-1282 to fix the inconsistency > >>>>>>>> using a > >>>>>>>>> whole new policy. Since KIP-1282 couldn't reach a consensus, > >>>>>>>>> KIP-1327 > >>>>>>>> goes > >>>>>>>>> back to using flexible configurations to prevent users from falling > >>>>>>> into > >>>>>>>>> that pitfall. > >>>>>>>>> > >>>>>>>>> Best, Chia-Ping > >>>>>>>>> > >>>>>>>>> Jun Rao via dev <[email protected]> wrote on Sat, June 13, 2026, at 6:49 AM: > >>>>>>>>> > >>>>>>>>>> Hi, Jiunn-Yang, > >>>>>>>>>> > >>>>>>>>>> Thanks for the reply and sorry for the late reply. > >>>>>>>>>> > >>>>>>>>>> JR1. The design of auto.offset.reset.max.age.ms > >>>>>>> still feels weird to > >>>>>>>> me. > >>>>>>>>>> It > >>>>>>>>>> categorizes partitions as new or existing based on the partition > >>>>>>>> creation > >>>>>>>>>> time. Intuitively, the categorization should be based on the group > >>>>>>>> creation > >>>>>>>>>> time: all partitions existing when the group is created are > >>>>>>>>>> existing > >>>>>>> and > >>>>>>>>>> all partitions created after the group creation are new partitions. > >>>>>>>>>> > >>>>>>>>>> Jun > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Tue, Jun 9, 2026 at 8:51 AM 黃竣陽 <[email protected]> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi all, > >>>>>>>>>>> > >>>>>>>>>>> Manually bumping this thread. If there is no further > >>>>>>>>>>> discussion, I will close the vote. 
> >>>>>>>>>>> > >>>>>>>>>>> Best Regards, > >>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>> > >>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on June 1, 2026, at 7:16 PM: > >>>>>>>>>>>> > >>>>>>>>>>>> Hello Jian, > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks for your feedback, > >>>>>>>>>>>> > >>>>>>>>>>>> Agreed, partition expansion is a common operational task, not an > >>>>>>> edge > >>>>>>>>>>>> case. I've updated the Motivation section accordingly. > >>>>>>>>>>>> > >>>>>>>>>>>> Best Regards, > >>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>> > >>>>>>>>>>>>> jian fu <[email protected]> wrote on June 1, 2026, at 5:49 PM: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Hi Jiunn-Yang: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for the KIP. I think it would be useful to clarify that > >>>>>>> this > >>>>>>>>>> is a > >>>>>>>>>>>>> common scenario rather than an edge case, which further > >>>>>>> demonstrates > >>>>>>>>>> the > >>>>>>>>>>>>> need for this optimization. For example: > >>>>>>>>>>>>> A partition expansion is a common operational task in Kafka: To > >>>>>>>>>> balance > >>>>>>>>>>>>> resource utilization and cost, topics are typically created > >>>>>>>>>>>>> with a > >>>>>>>>>>> moderate > >>>>>>>>>>>>> default partition count. However, as traffic grows over time, it > >>>>>>> is > >>>>>>>>>>> often > >>>>>>>>>>>>> necessary to increase the number of partitions to accommodate > >>>>>>>>>>>>> the > >>>>>>>>>> higher > >>>>>>>>>>>>> workload. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Regards > >>>>>>>>>>>>> Jian > >>>>>>>>>>>>> > >>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on Sat, May 30, 2026, at 22:31: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Hello chia, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks for the comments, I have updated the KIP! 
> >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月30日 晚上8:29 寫道: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi Jiunn-Yang, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Would you mind removing the terms "hot" and "cold" when > >>>>>>> describing > >>>>>>>>>>>>>>> partitions in the KIP? I understand you are using them to > >>>>>>> describe > >>>>>>>>>> the > >>>>>>>>>>>>>>> "freshness" or the users' need for the records, but applying > >>>>>>> these > >>>>>>>>>>> terms > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> the partition itself feels a bit unnatural. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> After all, in this scenario, users don't really care whether a > >>>>>>>>>>> partition > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>> newly expanded or not. Their only expectation is that they > >>>>>>>>>>>>>>> won't > >>>>>>>>>>> silently > >>>>>>>>>>>>>>> lose any live records produced to the topic during their > >>>>>>>>>>>>>>> active > >>>>>>>>>>>>>> consumption. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Best, Chia-Ping > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> 黃竣陽 <[email protected]> 於 2026年5月30日週六 下午12:30寫道: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Hello Jun, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Thanks for the feedback, I have updated the KIP motivation > >>>>>>>> section. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Jun Rao via dev <[email protected]> 於 2026年5月30日 凌晨1:12 > >>>>>>> 寫道: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Hi, Jiunn-Yang, > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Thanks for the reply. I think we need a stronger motivation > >>>>>>> for > >>>>>>>>>> the > >>>>>>>>>>>>>> KIP. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> The KIP says "The core insight is that not all partitions > >>>>>>> without > >>>>>>>>>> a > >>>>>>>>>>>>>>>>> committed offset are the same. 
A newly expanded partition > >>>>>>> (hot) > >>>>>>>> is > >>>>>>>>>>>>>>>>> fundamentally different from a partition the consumer has > >>>>>>> never > >>>>>>>>>> seen > >>>>>>>>>>>>>>>>> because it predates the group (cold)." Why is the hot > >>>>>>> partition > >>>>>>>>>>>>>>>>> fundamentally different from the cold? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> The KIP says "The existing by_duration policy is also > >>>>>>>> insufficient > >>>>>>>>>>>>>>>> because: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> - The calculated seek time (now() - duration) varies across > >>>>>>> nodes > >>>>>>>>>>> due > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>> clock skew. To be safe, users must set an overly large > >>>>>>> duration, > >>>>>>>>>>>>>>>> causing > >>>>>>>>>>>>>>>>> unnecessary reprocessing. > >>>>>>>>>>>>>>>>> - On network errors, the client recalculates the seek time > >>>>>>>>>>>>>>>>> on > >>>>>>>>>> retry, > >>>>>>>>>>>>>>>>> shifting the target timestamp forward and risking data > >>>>>>>>>>>>>>>>> loss." > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> However, both of these situations are rare. If these issues > >>>>>>>>>> persist, > >>>>>>>>>>>>>> more > >>>>>>>>>>>>>>>>> severe problems likely exist elsewhere. Rare situations > >>>>>>>>>>>>>>>>> don't > >>>>>>>>>> need a > >>>>>>>>>>>>>>>> common > >>>>>>>>>>>>>>>>> solution. If users care about those rare situations, they > >>>>>>>>>>>>>>>>> can > >>>>>>>>>>> implement > >>>>>>>>>>>>>>>>> customized logic using > >>>>>>>>>>>>>> ConsumerRebalanceListener.onPartitionsAssigned(). 
> >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Jun > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Sun, May 17, 2026 at 6:50 AM 黃竣陽 <[email protected]> > >>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hello chia, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Thanks for the feedback, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should > >>>>>>> always > >>>>>>>> be > >>>>>>>>>>>>>>>> greater > >>>>>>>>>>>>>>>>>> than or equal to zero, right? > >>>>>>>>>>>>>>>>>> I have explicitly mentioned this in the KIP. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> New Old (MetadataResponse v0–13) positive any > >>>>>>>>>>> field > >>>>>>>>>>>>>>>>>> absent UnsupportedVersionException > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> The earliest point at which we can detect the version > >>>>>>> mismatch > >>>>>>>> is > >>>>>>>>>>>>>> during > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> first metadata fetch after assignment, which occurs inside > >>>>>>>>>> poll(). > >>>>>>>>>>>>>>>>>> Therefore, the > >>>>>>>>>>>>>>>>>> user would encounter an UnsupportedVersionException from > >>>>>>> poll(). > >>>>>>>>>>> I’ll > >>>>>>>>>>>>>>>>>> clarify this in the KIP. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月17日 下午4:50 > >>>>>>> 寫道: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> hi Jiunn > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> PartitionAgeMs (int64, default -1): The age of this > >>>>>>> partition > >>>>>>>>>> in > >>>>>>>>>>>>>>>>>> milliseconds, computed server-side by the broker as > >>>>>>>>>>>>>> broker_current_time > >>>>>>>>>>>>>>>> - > >>>>>>>>>>>>>>>>>> partition_creation_time. Returns -1 if the broker does not > >>>>>>>>>> support > >>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>> feature or the partition creation time is unknown. 
> >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should > >>>>>>> always > >>>>>>>> be > >>>>>>>>>>>>>>>> greater > >>>>>>>>>>>>>>>>>> than or equal to zero, right? > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> New Old (MetadataResponse v0–13) positive any > >>>>>>>>>>> field > >>>>>>>>>>>>>>>>>> absent UnsupportedVersionException > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Will user encounter UnsupportedVersionException when > >>>>>>>>>>>>>>>>>>> calling > >>>>>>>>>>>>>> `poll()`? > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>> Chia-Ping > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On 2026/05/16 04:30:49 黃竣陽 wrote: > >>>>>>>>>>>>>>>>>>>> Hello Jun, chia, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> I've updated KIP-1327 with a design change based on the > >>>>>>>>>>> discussion > >>>>>>>>>>>>>>>>>>>> feedback. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> The updated design decouples the new-partition reset > >>>>>>> behavior > >>>>>>>>>>> from > >>>>>>>>>>>>>>>>>>>> the base auto.offset.reset policy: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> - auto.offset.reset.max.age.ms > >>>>>>> now applies to all > >>>>>>>>>>> auto.offset.reset > >>>>>>>>>>>>>>>>>> values > >>>>>>>>>>>>>>>>>>>> (latest, earliest, by_duration, none). 
> >>>>>>>>>>>>>>>>>>>> - For new ("hot") partitions, the consumer resets to > >>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions > >>>>>>>>>>>>>>>>>>>> config setting > >>>>>>>>>>>>>>>>>>>> - For existing ("cold") partitions, the base > >>>>>>> auto.offset.reset > >>>>>>>>>>>>>> policy > >>>>>>>>>>>>>>>>>> continues > >>>>>>>>>>>>>>>>>>>> to apply unchanged. > >>>>>>>>>>>>>>>>>>>> - The new-partition reset behavior is represented by a > >>>>>>>> separate > >>>>>>>>>>>>>>>>>> internal config > >>>>>>>>>>>>>>>>>>>> (auto.offset.reset.new.partitions, > >>>>>>> currently fixed to > >>>>>>>>>> earliest). > >>>>>>>>>>>>>> This > >>>>>>>>>>>>>>>>>> decoupled design makes > >>>>>>>>>>>>>>>>>>>> it straightforward to promote the behavior to a public > >>>>>>>>>>> user-facing > >>>>>>>>>>>>>>>>>> configuration in a future KIP. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on May 16, 2026, at 7:46 AM: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> hi Jun > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I see what you mean now. The proposal from me is listed > >>>>>>>> below: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> 1) Add auto.offset.reset.new.partitions > >>>>>>> with a default value > >>>>>>>>>> of > >>>>>>>>>>>>>>>>>> earliest. 
It fixes the data loss from both by_duration and > >>>>>>>>>> latest, > >>>>>>>>>>> and > >>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>> does not change the logic of auto.offset.reset=earliest. > >>>>>>>>>>>>>>>>>>>>> 2) Mark auto.offset.reset.new.partitions > >>>>>>> as an internal > >>>>>>>>>>>>>>>>>> configuration. auto.offset.reset.new.partitions=earliest > >>>>>>>> already > >>>>>>>>>>>>>>>>>> addresses the issue, and we can discuss the use cases of > >>>>>>> other > >>>>>>>>>>> values > >>>>>>>>>>>>>>>> in a > >>>>>>>>>>>>>>>>>> separate KIP. > >>>>>>>>>>>>>>>>>>>>> 3) Both configs, auto.offset.reset.new.partitions > >>>>>>> and > >>>>>>>>>>>>>>>>>> auto.offset.reset.latest.max.age.ms, > >>>>>>> will be applied to all for > >>>>>>>>>>>>>>>>>> consistency. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> WDYT? > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On 2026/05/15 20:53:20 Jun Rao via dev wrote: > >>>>>>>>>>>>>>>>>>>>>> Hi, Chia-Ping, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Thanks for the reply. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> 1. 
In the motivation section, the KIP says "When a > >>>>>>>>>>>>>>>>>>>>>> Kafka > >>>>>>>>>> topic > >>>>>>>>>>> is > >>>>>>>>>>>>>>>>>> expanded > >>>>>>>>>>>>>>>>>>>>>> with new partitions, consumers using the latest auto > >>>>>>> offset > >>>>>>>>>>> reset > >>>>>>>>>>>>>>>>>> policy > >>>>>>>>>>>>>>>>>>>>>> will silently miss all records produced to those > >>>>>>> partitions > >>>>>>>>>>> before > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>> consumer discovers them.". If a user sets > >>>>>>>>>>>>>>>>>>>>>> auto.offset.reset=by_duration=1sec, the same record > >>>>>>>>>>>>>>>>>>>>>> loss > >>>>>>>>>> issue > >>>>>>>>>>>>>> could > >>>>>>>>>>>>>>>>>> also > >>>>>>>>>>>>>>>>>>>>>> happen, right? > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> 2. I was thinking auto.offset.reset.new.partitions > >>>>>>> will > >>>>>>>> take > >>>>>>>>>>> the > >>>>>>>>>>>>>>>> same > >>>>>>>>>>>>>>>>>>>>>> values as auto.offset.reset. So a user could set it to > >>>>>>>>>>> by_duration if > >>>>>>>>>>>>>>>>>> needed. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Jun > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai < > >>>>>>>>>>>>>> [email protected] > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> hi Jun > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback. I might be missing something > >>>>>>>>>>> important > >>>>>>>>>>>>>>>> from > >>>>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>>>>>>>>>> suggestion, so please bear with me as I try to clarify > >>>>>>> with > >>>>>>>>>> a > >>>>>>>>>>> few > >>>>>>>>>>>>>>>>>> questions: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 1. 
Is there a strong use case for extending this logic > >>>>>>> to > >>>>>>>>>>> other > >>>>>>>>>>>>>>>> reset > >>>>>>>>>>>>>>>>>>>>>>> policies? Unlike latest, policies like earliest or > >>>>>>>>>> by_duration > >>>>>>>>>>>>>>>> don't > >>>>>>>>>>>>>>>>>> seem > >>>>>>>>>>>>>>>>>>>>>>> to suffer from the same silent data loss issue when a > >>>>>>>>>>> partition > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>> expanded. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> 2. What values would we expect users to configure for > >>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions? > >>>>>>> If they set it to > >>>>>>>>>> earliest > >>>>>>>>>>> or > >>>>>>>>>>>>>>>>>> latest, > >>>>>>>>>>>>>>>>>>>>>>> we might run into the exact same edge cases. For > >>>>>>> example, > >>>>>>>>>> if a > >>>>>>>>>>>>>>>>>> consumer is > >>>>>>>>>>>>>>>>>>>>>>> offline for a while and a new partition is created > >>>>>>> during > >>>>>>>>>> that > >>>>>>>>>>>>>>>>>> downtime, > >>>>>>>>>>>>>>>>>>>>>>> the user might actually want to skip to latest when > >>>>>>>>>> resuming, > >>>>>>>>>>>>>>>> rather > >>>>>>>>>>>>>>>>>> than > >>>>>>>>>>>>>>>>>>>>>>> reading from earliest just because the partition is > >>>>>>>>>>> technically > >>>>>>>>>>>>>>>>>> "new" to > >>>>>>>>>>>>>>>>>>>>>>> the group. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> This is exactly why we opted for introducing a max.age > >>>>>>>>>>> threshold. > >>>>>>>>>>>>>>>> It > >>>>>>>>>>>>>>>>>> gives > >>>>>>>>>>>>>>>>>>>>>>> users a time-bound way to define what is genuinely > >>>>>>>> "hot/new" > >>>>>>>>>>> and > >>>>>>>>>>>>>>>>>> what is > >>>>>>>>>>>>>>>>>>>>>>> just an old partition they haven't seen yet. 
> >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>> Chia-Ping > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On 2026/05/14 20:48:09 Jun Rao via dev wrote: > >>>>>>>>>>>>>>>>>>>>>>>> Hi, Jiunn-Yang, > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> I find auto.offset.reset.latest.max.age a bit weird. > >>>>>>>>>>>>>>>>>>>>>>>> It > >>>>>>>>>> only > >>>>>>>>>>>>>>>>>> applies when > >>>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset is latest. However, it seems that > >>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>> motivation > >>>>>>>>>>>>>>>>>>>>>>> equally > >>>>>>>>>>>>>>>>>>>>>>>> applies when auto.offset.reset is set to other values > >>>>>>> like > >>>>>>>>>>>>>>>>>> by_duration. > >>>>>>>>>>>>>>>>>>>>>>> The > >>>>>>>>>>>>>>>>>>>>>>>> intention is that we want to have a separate way to > >>>>>>>> control > >>>>>>>>>>>>>> newly > >>>>>>>>>>>>>>>>>> created > >>>>>>>>>>>>>>>>>>>>>>>> partitions vs existing partitions when the group > >>>>>>> starts. > >>>>>>>>>>> Have we > >>>>>>>>>>>>>>>>>>>>>>> considered > >>>>>>>>>>>>>>>>>>>>>>>> adding a new config like auto.offset.reset.new.partitions? > >>>>>>>>>>> If > >>>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>> new > >>>>>>>>>>>>>>>>>>>>>>>> config is not set, the offset reset policy defaults > >>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>> the > >>>>>>>>>>>>>> policy > >>>>>>>>>>>>>>>>>> used > >>>>>>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>> existing partitions. The user could set it explicitly > >>>>>>> to > >>>>>>>>>>>>>> customize > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> behavior for new partitions. 
> >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Jun > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On Thu, May 7, 2026 at 5:07 AM 黃竣陽 > >>>>>>>>>>>>>>>>>>>>>>>> <[email protected] > >>>>>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> I’d like to manually bump this thread. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on May 1, 2026, at 10:37 PM: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hello all, > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> DJ01/DJ02: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> MetadataResponse bumps from v13 to v14. The > >>>>>>>>>>> PartitionMetadata > >>>>>>>>>>>>>>>>>> struct > >>>>>>>>>>>>>>>>>>>>>>>>> gains a new > >>>>>>>>>>>>>>>>>>>>>>>>>> field PartitionAgeMs (int64, default -1), computed > >>>>>>>>>>> server-side > >>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> broker as > >>>>>>>>>>>>>>>>>>>>>>>>>> broker_current_time - partition_creation_time. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Also add the consumer heartbeat flow. 
when > >>>>>>>>>>> MembershipManager > >>>>>>>>>>>>>>>>>> detects > >>>>>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>>>> newly assigned > >>>>>>>>>>>>>>>>>>>>>>>>>> partition, it explicitly invalidates the metadata > >>>>>>>>>>>>>>>>>>>>>>>>>> for > >>>>>>>> the > >>>>>>>>>>>>>>>> affected > >>>>>>>>>>>>>>>>>>>>>>> topic > >>>>>>>>>>>>>>>>>>>>>>>>> and forces a fresh MetadataRequest > >>>>>>>>>>>>>>>>>>>>>>>>>> before making the offset reset decision, even if > >>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>> topic > >>>>>>>>>>> ID > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>> already > >>>>>>>>>>>>>>>>>>>>>>>>> in the cache. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> MB0: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> The consumer learns the broker's maximum supported > >>>>>>>>>>>>>>>>>> MetadataResponse > >>>>>>>>>>>>>>>>>>>>>>>>> version via the > >>>>>>>>>>>>>>>>>>>>>>>>>> ApiVersions negotiation at connection time. If the > >>>>>>>>>>> negotiated > >>>>>>>>>>>>>>>>>>>>>>> version is > >>>>>>>>>>>>>>>>>>>>>>>>> unsupported, the consumer > >>>>>>>>>>>>>>>>>>>>>>>>>> knows the broker does not support PartitionAgeMs at > >>>>>>> all > >>>>>>>>>> and > >>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>>>>>>> throw an > >>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedVersionException > >>>>>>>>>>>>>>>>>>>>>>>>>> immediately, rather than silently falling back to > >>>>>>> latest > >>>>>>>>>>> and > >>>>>>>>>>>>>>>>>> risking > >>>>>>>>>>>>>>>>>>>>>>>>> data loss without any operator-visible signal. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> MB1/MB2/MB3: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> I have addressed these changes in the KIP. 
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> On Apr 29, 2026, at 4:04 PM, Chia-Ping Tsai <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> hi David
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with the direction of moving the 'age' resolution from the Heartbeat API to the Metadata API to keep the control plane clean. The main trade-off, as we noted before, is introducing inter-broker clock skew. The Group Coordinator approach provided a single source of truth for time.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> However, realistically, this time skew should be negligible. Given that the max.age threshold will likely be configured in minutes or hours, a typical NTP skew (in milliseconds) between brokers won't impact the fallback decision.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Apr 29, 2026, at 3:29 PM, David Jacot via dev <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP!
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry, I haven't really followed the previous conversation but I took a quick look at this one.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ01: I don't clearly understand the flow with the ConsumerGroupHeartbeat API after reading the KIP. There is a new boolean; the KIP states that partition ages are returned only when this boolean is set. Implicitly, this means that when the consumer receives a new partition, it will issue a new HB request with the boolean set to receive the ages. Is my understanding correct? We should perhaps clarify the flow and also explain how it fits into the existing flow (e.g. list offsets, fetch offsets, etc.).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ02: If my understanding is correct, I wonder if the ConsumerGroupHeartbeat API is the right place for this given that a new round trip is done anyway. Alternatively, it could simply include the metadata. Generally, we should be rather careful not to overload the ConsumerGroupHeartbeat API with unrelated concepts. The API is a control plane API for assigning or revoking partitions. The fact that we don't want to add it to the corresponding Streams API also suggests something is not quite right. What would we do if we want to support Streams in the future?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> David
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiunn,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for this great KIP. Good to know about the gap.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-0 - Why a new v2 version bump for the RequestPartitionAges field? Can a tagged field (for example, on the response, PartitionAges on TopicPartitions) be used here to avoid a version bump?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-1 - For the new config, is there a recommended value or a ConfigDef validator? Probably it should be based on metadata.max.age.ms? Sizing instructions can be part of the javadocs, I guess.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it be better to add this new config auto.offset.reset.latest.max.age to the StreamsConfig block list (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS) for a clear warning, in case users configure it? This is the most familiar consumer config and users might easily configure it by mistake. Or maybe it's not worth adding.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest" reads as if the config were being changed per-partition, which isn't supported. Maybe rephrase to something like "consumer resolves the initial position to start offset for that partition", as if earliest was applied to that partition only and the auto.offset.reset config is unchanged.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Murali
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi chia,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have updated the KIP to include this change.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Apr 28, 2026, at 8:03 PM, Chia-Ping Tsai <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hi Jiunn-Yang
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> chia_0: Should we expose the partition creation time via the Admin API? I assume it would be valuable for users to diagnose and troubleshoot the behavior of auto.offset.reset.latest.max.age.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on KIP-1327 Prevent Hot Data Loss on Partition Expansion for Latest Policy
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/x/KY4mGQ>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This proposal aims to introduce auto.offset.reset.latest.max.age, a consumer config that lets the latest reset policy distinguish newly expanded (hot) partitions from long-existing (cold) ones. Partitions younger than the configured threshold automatically fall back to earliest, preventing silent data loss during topic expansion without forcing a full historical reprocess.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
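For readers following the thread, the decision being discussed can be pictured with a minimal Java sketch. It combines the age formula quoted above (broker_current_time - partition_creation_time) with Jun's suggestion that an unset new-partition policy defaults to the policy used for existing partitions. Everything named here is hypothetical: the class NewPartitionResetSketch, the ResetPolicy enum, the resolve method, and the threshold parameter are illustrations only and are not taken from the KIP or from the Kafka code base. Treating a negative age (such as the -1 default) as "unknown" and returning the base policy is likewise an assumption.

```java
import java.util.Optional;

// Hypothetical sketch: class, enum, and method names are illustrative and
// are not taken from the Kafka code base or from the KIP text.
final class NewPartitionResetSketch {

    enum ResetPolicy { EARLIEST, LATEST }

    // Age as described in the thread: broker_current_time - partition_creation_time.
    static long partitionAgeMs(long brokerNowMs, long partitionCreationMs) {
        return brokerNowMs - partitionCreationMs;
    }

    // Picks the reset policy for one partition that has no committed offset.
    // Assumption: a negative age (for example the -1 default) means the age
    // is unknown, so the base policy applies.
    static ResetPolicy resolve(ResetPolicy basePolicy,
                               Optional<ResetPolicy> newPartitionPolicy,
                               long partitionAgeMs,
                               long newPartitionThresholdMs) {
        if (partitionAgeMs < 0) {
            return basePolicy;
        }
        boolean isNewPartition = partitionAgeMs < newPartitionThresholdMs;
        if (!isNewPartition) {
            return basePolicy;
        }
        // An unset new-partition policy defaults to the policy for existing partitions.
        return newPartitionPolicy.orElse(basePolicy);
    }
}
```

Under these assumptions, a partition created 6 seconds ago, with a base policy of LATEST, a new-partition policy of EARLIEST, and a 60-second threshold, resolves to EARLIEST, while a partition older than the threshold keeps LATEST.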
