Hi all, Manually bumping this thread.
Best Regards, Jiunn-Yang > 黃竣陽 <[email protected]> wrote on June 17, 2026, at 9:17 PM: > > Hello chia, > > Thanks for the feedback, I have updated the KIP. > > Best Regards, > Jiunn-Yang > >> Chia-Ping Tsai <[email protected]> wrote on June 17, 2026, at 12:47 AM: >> >> hi Jiunn-Yang >> >>> When the config is set on a cluster that has not yet been upgraded... >>> classification cannot occur... the consumer falls back to the base >>> auto.offset.reset for the affected partitions. No exception is thrown, and >>> no operational disruption results. >> >> Existing groups can't take advantage of this excellent new configuration. >> Allowing users to modify the group creation time might be overkill. Instead, >> we could print a useful warning message to guide users. For example, we can >> suggest that they re-create the group with their existing committed offsets. >> >>> Protocol changes >> >> Would you mind listing those RPC changes in a table format? >> >>> The full interaction matrix between the base policy and the new-partition >>> policy is: >> >> Please add a field to describe the target scenario when using these policies. >> >> Best, >> Chia-Ping >> >> >> On 2026/06/16 16:14:49 黃竣陽 wrote: >>> Hello Jun, chia, >>> >>> Thanks for the feedback, I have updated the KIP for the new >>> approach, PTAL >>> >>> Best Regards, >>> Jiunn-Yang >>> >>>> Chia-Ping Tsai <[email protected]> wrote on June 16, 2026, at 8:23 AM: >>>> >>>> hi Jun >>>> >>>> Yes, your approach is great. I think the combination of latest (for >>>> existing partitions) and by_duration (for new partitions) can address 99% >>>> of the complaints I have heard regarding this issue. >>>> >>>> Also, leveraging the group creation time here opens the door to >>>> implementing a new policy based on timestamp seek in the future, should >>>> the community want to pursue that. >>>> >>>> Thanks for your patience and constructive feedback. We will update the KIP >>>> accordingly.
>>>> >>>> Best, Chia-Ping >>>> >>>>> Jun Rao via dev <[email protected]> 於 2026年6月16日 清晨5:11 寫道: >>>>> >>>>> Hi, Chia-Ping, >>>>> >>>>> Thanks for the reply. >>>>> >>>>> I agree that it's probably useful to allow a user to configure a different >>>>> offset policy for existing partitions vs new partitions. However, using >>>>> group creation time to capture that seems more intuitive. Here is another >>>>> proposal: remove auto.offset.reset.max.age.ms and categorize new >>>>> partitions >>>>> based on group creation time. Introduce >>>>> a new config auto.offset.reset.new.partitions whose values can be >>>>> earliest, >>>>> latest and by_duration, the same as auto.offset.reset. Users can set >>>>> `auto.offset.reset.new.partitions` to `earliest` if they want to guarantee >>>>> no data loss on new partitions. They can also use by_duration to set an >>>>> upper bound on the backlog replayed, which can be different from that of >>>>> the existing partitions. This will address your concern about too much >>>>> backlog being replayed when the offsets are lost. What do you think? >>>>> >>>>> Jun >>>>> >>>>> >>>>>> On Mon, Jun 15, 2026 at 10:39 AM Chia-Ping Tsai <[email protected]> >>>>>> wrote: >>>>>> >>>>>> hi Jun >>>>>> >>>>>> The most important part of this story is how users should expect the data >>>>>> they can see when using the latest or by_duration policy with expanded >>>>>> partitions. >>>>>> >>>>>> Yes, the by_duration policy can minimize data loss, but it is >>>>>> non-deterministic, which means users will either read too many historical >>>>>> records from existing partitions or lose some records from expanded >>>>>> partitions. 
>>>>>> >>>>>> Also, I agree that auto.offset.reset.max.age.ms >>>>>> is a bit hard to understand, and that is why I preferred having a whole >>>>>> new >>>>>> policy based entirely on group creation time (KIP-1282). >>>>>> >>>>>> Best, >>>>>> Chia-Ping >>>>>> >>>>>> Jun Rao via dev <[email protected]> wrote on Tue, June 16, 2026, at 1:08 AM: >>>>>> >>>>>>> Hi, Chia-Ping and Jiunn-Yang, >>>>>>> >>>>>>> Thanks for the reply. I am still trying to understand the value of the >>>>>>> new >>>>>>> configs with the KIP. >>>>>>> >>>>>>> The motivation of the KIP is that a user doesn't want to miss the data >>>>>>> if >>>>>>> the backlog is small. The backlog of the existing partition is easy to >>>>>>> understand because it relates to retention time. The backlog for the new >>>>>>> partition is a bit subtle to understand since it depends on the metadata >>>>>>> refresh delay. To set auto.offset.reset.max.age.ms, >>>>>>> the user needs to >>>>>>> understand the metadata refresh delay on the consumer side and use it to >>>>>>> set the config. >>>>>>> >>>>>>> Now, let's consider the alternative: setting the same value for the >>>>>>> existing by_duration policy. The KIP lists three issues with this >>>>>>> approach. >>>>>>> 1. It computes the seek target client-side as now() - duration, which >>>>>>> introduces clock skew across consumers and forces operators to choose >>>>>>> overly large durations, causing unnecessary reprocessing. >>>>>>> 2. The target timestamp is recomputed on each retry, so failed >>>>>>> ListOffsetsRequest retries can shift the target forward and potentially >>>>>>> miss records produced between attempts. >>>>>>> 3.
It applies uniformly to all partitions without committed offsets, and >>>>>>> cannot distinguish newly expanded partitions from long-existing >>>>>>> partitions >>>>>>> newly assigned to the group, leading to unnecessary replay. >>>>>>> >>>>>>> Issues 1 and 2 are uncommon and can be mitigated by adding a bit of buffer >>>>>>> to >>>>>>> the metadata refresh delay. We could also consider improving the >>>>>>> implementation. For issue 3, the metadata refresh delay is typically low >>>>>>> (on the order of minutes with the classic consumer and tens of seconds >>>>>>> with >>>>>>> the new consumer). If a user is ok with reading that much backlog for >>>>>>> new >>>>>>> partitions, it seems they will be ok doing the same for existing >>>>>>> partitions. >>>>>>> >>>>>>> So, instead of introducing a new config, could we just reuse the >>>>>>> existing >>>>>>> config with better documentation and/or implementation? >>>>>>> >>>>>>> Jun >>>>>>> >>>>>>> >>>>>>>> On Sat, Jun 13, 2026 at 12:19 AM 黃竣陽 <[email protected]> wrote: >>>>>>> >>>>>>>> Hello Jun, >>>>>>>> >>>>>>>> You're right that group creation time is the more intuitive answer at >>>>>>>> first glance; >>>>>>>> the KIP's own motivation talks about partitions that "predate the >>>>>>>> group" >>>>>>>> vs partitions >>>>>>>> "created during group runtime," which directly points to a >>>>>>> group-lifecycle >>>>>>>> classifier. >>>>>>>> I'd like to walk through why we landed on partition age, and the >>>>>>>> trade-offs we considered. >>>>>>>> >>>>>>>> We evaluated three candidate signals: >>>>>>>> >>>>>>>> 1. `by_duration:5secs` >>>>>>>> >>>>>>>> This covers the metadata blindness window, but has issues the KIP >>>>>>>> currently documents >>>>>>>> under "Why not use `by_duration`?": >>>>>>>> >>>>>>>> - Client-side `now() - duration` introduces clock skew across >>>>>>>> consumers. >>>>>>>> - `ListOffsets` retries shift the target forward, potentially missing >>>>>>>> records produced between >>>>>>>> attempts.
>>>>>>>> - It applies uniformly to all partitions without committed offsets, >>>>>>>> including pre-existing partitions >>>>>>>> newly assigned to the group, causing unnecessary replay. >>>>>>>> >>>>>>>> 2. Group creation time as classifier >>>>>>>> >>>>>>>> This works cleanly when the consumer is actively running. Our concern >>>>>>>> is the idle / late-rejoin case: >>>>>>>> >>>>>>>> T=0: Group created. >>>>>>>> T=1..T=100: Consumer idle (down, disconnected, etc.). >>>>>>>> T=50: Partition added during the idle window. >>>>>>>> T=100: Consumer resumes. >>>>>>>> >>>>>>>> Under group creation time, the new partition is classified as new >>>>>>>> (`50 > 0`) and reset to `earliest`, replaying everything from T=50. >>>>>>>> But during `[T=1, T=100]`, base partitions also accumulated data that >>>>>>>> the consumer accepts as lost — that is precisely the contract of >>>>>>>> `auto.offset.reset=latest`. There is no principled reason to treat >>>>>>>> the new partition differently; both contain backlog accumulated during >>>>>>>> the same idle window. >>>>>>>> >>>>>>>> This aligns with the "backlog is backlog” principle you raised in >>>>>>>> the KIP-1282 thread: a `latest` user has tolerated some backlog on >>>>>>>> every other partition during the same idle period; forcing 0-backlog >>>>>>>> tolerance only on new partitions would be inconsistent with that >>>>>>>> tolerance. >>>>>>>> >>>>>>>> 3. Partition age vs threshold >>>>>>>> >>>>>>>> Partition age corresponds to the actual silent data loss window, >>>>>>>> the gap between partition creation and the consumer’s metadata >>>>>>>> refresh. Within this window, data loss is genuinely silent: the >>>>>>>> consumer had no opportunity to know about the partition. Outside this >>>>>>>> window, missing data reflects either: >>>>>>>> >>>>>>>> - (a) the user’s tolerated cost of running with idle consumers, or >>>>>>>> - (b) an operational issue to surface via monitoring, not via reset >>>>>>> policy. 
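The idle / late-rejoin timeline above can be replayed in a few lines to see where the two classifiers disagree. This is only a minimal sketch, not the KIP's implementation: the function names and the threshold of 10 time units are assumptions made up for illustration, and times are the abstract T values from the example.

```python
# Hypothetical helpers; names and the threshold value are illustrative only.

def is_new_by_group_creation(partition_created_at: int, group_created_at: int) -> bool:
    """Classify a partition as new if it was created after the group."""
    return partition_created_at > group_created_at


def is_new_by_partition_age(partition_created_at: int, now: int, max_age: int) -> bool:
    """Classify a partition as new if its age is within the threshold."""
    return (now - partition_created_at) <= max_age


# Timeline from the example: group created at T=0, partition added at T=50,
# consumer resumes at T=100.
GROUP_CREATED_AT = 0
PARTITION_CREATED_AT = 50
NOW = 100
MAX_AGE = 10  # assumed threshold, roughly "metadata refresh delay plus buffer"

by_group = is_new_by_group_creation(PARTITION_CREATED_AT, GROUP_CREATED_AT)
by_age = is_new_by_partition_age(PARTITION_CREATED_AT, NOW, MAX_AGE)

print(f"group creation time classifier: new={by_group}")
print(f"partition age classifier:       new={by_age}")
```

With these assumed values, the group-creation classifier treats the partition as new, so the new-partition policy would apply after the idle window, while the age classifier treats it as an existing partition and leaves it to the base auto.offset.reset.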
>>>>>>>> >>>>>>>> We did not choose partition age because it is more elegant than group >>>>>>>> creation time; we chose it because its failure mode (requires a >>>>>>>> threshold) is >>>>>>>> less invasive than the failure mode of group creation time (overrides >>>>>>>> user-stated >>>>>>>> `latest` intent during idle periods). >>>>>>>> >>>>>>>> Best Regards, >>>>>>>> Jiunn-Yang >>>>>>>> >>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on June 13, 2026, at 11:52 AM: >>>>>>>>> >>>>>>>>> Hi Jun, >>>>>>>>> >>>>>>>>> Relying on both creation times will create an inconsistent scenario. A >>>>>>>>> consumer that lost all offsets due to a long sleep will seek to the >>>>>>>>> beginning for the partitions created later than the group. >>>>>>>>> >>>>>>>>> That is why we initially proposed KIP-1282 to fix the inconsistency >>>>>>>> using a >>>>>>>>> whole new policy. Since KIP-1282 couldn't reach a consensus, KIP-1327 >>>>>>>> goes >>>>>>>>> back to using flexible configurations to prevent users from falling >>>>>>> into >>>>>>>>> that pitfall. >>>>>>>>> >>>>>>>>> Best, Chia-Ping >>>>>>>>> >>>>>>>>> Jun Rao via dev <[email protected]> wrote on Sat, June 13, 2026, at 6:49 AM: >>>>>>>>> >>>>>>>>>> Hi, Jiunn-Yang, >>>>>>>>>> >>>>>>>>>> Thanks for the reply and sorry for the late reply. >>>>>>>>>> >>>>>>>>>> JR1. The design of auto.offset.reset.max.age.ms >>>>>>> still feels weird to >>>>>>>> me. >>>>>>>>>> It >>>>>>>>>> categorizes partitions as new or existing based on the partition >>>>>>>> creation >>>>>>>>>> time. Intuitively, the categorization should be based on the group >>>>>>>> creation >>>>>>>>>> time: all partitions existing when the group is created are existing >>>>>>> and >>>>>>>>>> all partitions created after the group creation are new partitions.
>>>>>>>>>> >>>>>>>>>> Jun >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jun 9, 2026 at 8:51 AM 黃竣陽 <[email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> Manually bumping this thread. If there is no further >>>>>>>>>>> discussion, I will close the vote. >>>>>>>>>>> >>>>>>>>>>> Best Regards, >>>>>>>>>>> Jiunn-Yang >>>>>>>>>>> >>>>>>>>>>>> 黃竣陽 <[email protected]> 於 2026年6月1日 晚上7:16 寫道: >>>>>>>>>>>> >>>>>>>>>>>> Hello Jian, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your feedback, >>>>>>>>>>>> >>>>>>>>>>>> Agreed, partition expansion is a common operational task, not an >>>>>>> edge >>>>>>>>>>>> case. I've updated the Motivation section accordingly. >>>>>>>>>>>> >>>>>>>>>>>> Best Regards, >>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>> >>>>>>>>>>>>> jian fu <[email protected]> 於 2026年6月1日 下午5:49 寫道: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Jiunn-Yang: >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the KIP. I think it would be useful to clarify that >>>>>>> this >>>>>>>>>> is a >>>>>>>>>>>>> common scenario rather than an edge case, which further >>>>>>> demonstrates >>>>>>>>>> the >>>>>>>>>>>>> need for this optimization. For example: >>>>>>>>>>>>> A partition expansion is a common operational task in Kafka: To >>>>>>>>>> balance >>>>>>>>>>>>> resource utilization and cost, topics are typically created with a >>>>>>>>>>> moderate >>>>>>>>>>>>> default partition count. However, as traffic grows over time, it >>>>>>> is >>>>>>>>>>> often >>>>>>>>>>>>> necessary to increase the number of partitions to accommodate the >>>>>>>>>> higher >>>>>>>>>>>>> workload. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards >>>>>>>>>>>>> Jian >>>>>>>>>>>>> >>>>>>>>>>>>> 黃竣陽 <[email protected]> 于2026年5月30日周六 22:31写道: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello chia, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for the comments, I have updated the KIP! 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月30日 晚上8:29 寫道: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Jiunn-Yang, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Would you mind removing the terms "hot" and "cold" when >>>>>>> describing >>>>>>>>>>>>>>> partitions in the KIP? I understand you are using them to >>>>>>> describe >>>>>>>>>> the >>>>>>>>>>>>>>> "freshness" or the users' need for the records, but applying >>>>>>> these >>>>>>>>>>> terms >>>>>>>>>>>>>> to >>>>>>>>>>>>>>> the partition itself feels a bit unnatural. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> After all, in this scenario, users don't really care whether a >>>>>>>>>>> partition >>>>>>>>>>>>>> is >>>>>>>>>>>>>>> newly expanded or not. Their only expectation is that they won't >>>>>>>>>>> silently >>>>>>>>>>>>>>> lose any live records produced to the topic during their active >>>>>>>>>>>>>> consumption. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, Chia-Ping >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 黃竣陽 <[email protected]> 於 2026年5月30日週六 下午12:30寫道: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hello Jun, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for the feedback, I have updated the KIP motivation >>>>>>>> section. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Jun Rao via dev <[email protected]> 於 2026年5月30日 凌晨1:12 >>>>>>> 寫道: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi, Jiunn-Yang, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for the reply. I think we need a stronger motivation >>>>>>> for >>>>>>>>>> the >>>>>>>>>>>>>> KIP. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The KIP says "The core insight is that not all partitions >>>>>>> without >>>>>>>>>> a >>>>>>>>>>>>>>>>> committed offset are the same. 
A newly expanded partition >>>>>>> (hot) >>>>>>>> is >>>>>>>>>>>>>>>>> fundamentally different from a partition the consumer has >>>>>>> never >>>>>>>>>> seen >>>>>>>>>>>>>>>>> because it predates the group (cold)." Why is the hot >>>>>>> partition >>>>>>>>>>>>>>>>> fundamentally different from the cold? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The KIP says "The existing by_duration policy is also >>>>>>>> insufficient >>>>>>>>>>>>>>>> because: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - The calculated seek time (now() - duration) varies across >>>>>>> nodes >>>>>>>>>>> due >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>> clock skew. To be safe, users must set an overly large >>>>>>> duration, >>>>>>>>>>>>>>>> causing >>>>>>>>>>>>>>>>> unnecessary reprocessing. >>>>>>>>>>>>>>>>> - On network errors, the client recalculates the seek time on >>>>>>>>>> retry, >>>>>>>>>>>>>>>>> shifting the target timestamp forward and risking data loss." >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> However, both of these situations are rare. If these issues >>>>>>>>>> persist, >>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>> severe problems likely exist elsewhere. Rare situations don't >>>>>>>>>> need a >>>>>>>>>>>>>>>> common >>>>>>>>>>>>>>>>> solution. If users care about those rare situations, they can >>>>>>>>>>> implement >>>>>>>>>>>>>>>>> customized logic using >>>>>>>>>>>>>> ConsumerRebalanceListener.onPartitionsAssigned(). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Jun >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Sun, May 17, 2026 at 6:50 AM 黃竣陽 <[email protected]> >>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hello chia, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks for the feedback, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should >>>>>>> always >>>>>>>> be >>>>>>>>>>>>>>>> greater >>>>>>>>>>>>>>>>>> than or equal to zero, right? >>>>>>>>>>>>>>>>>> I have explicitly mentioned this in the KIP. 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> New Old (MetadataResponse v0–13) positive any >>>>>>>>>>> field >>>>>>>>>>>>>>>>>> absent UnsupportedVersionException >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The earliest point at which we can detect the version >>>>>>> mismatch >>>>>>>> is >>>>>>>>>>>>>> during >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> first metadata fetch after assignment, which occurs inside >>>>>>>>>> poll(). >>>>>>>>>>>>>>>>>> Therefore, the >>>>>>>>>>>>>>>>>> user would encounter an UnsupportedVersionException from >>>>>>> poll(). >>>>>>>>>>> I’ll >>>>>>>>>>>>>>>>>> clarify this in the KIP. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年5月17日 下午4:50 >>>>>>> 寫道: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> hi Jiunn >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PartitionAgeMs (int64, default -1): The age of this >>>>>>> partition >>>>>>>>>> in >>>>>>>>>>>>>>>>>> milliseconds, computed server-side by the broker as >>>>>>>>>>>>>> broker_current_time >>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>>> partition_creation_time. Returns -1 if the broker does not >>>>>>>>>> support >>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>> feature or the partition creation time is unknown. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should >>>>>>> always >>>>>>>> be >>>>>>>>>>>>>>>> greater >>>>>>>>>>>>>>>>>> than or equal to zero, right? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> New Old (MetadataResponse v0–13) positive any >>>>>>>>>>> field >>>>>>>>>>>>>>>>>> absent UnsupportedVersionException >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Will user encounter UnsupportedVersionException when calling >>>>>>>>>>>>>> `poll()`? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>> Chia-Ping >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On 2026/05/16 04:30:49 黃竣陽 wrote: >>>>>>>>>>>>>>>>>>>> Hello Jun, chia, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I've updated KIP-1327 with a design change based on the >>>>>>>>>>> discussion >>>>>>>>>>>>>>>>>>>> feedback. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The updated design decouples the new-partition reset >>>>>>> behavior >>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>> the base auto.offset.reset policy: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> - auto.offset.reset.max.age.ms >>>>>>> now applies to all >>>>>>>>>>> auto.offset.reset >>>>>>>>>>>>>>>>>> values >>>>>>>>>>>>>>>>>>>> (latest, earliest, by_duration, none). >>>>>>>>>>>>>>>>>>>> - For new ("hot") partitions, the consumer resets to >>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions >>>>>>>>>>>>>>>>>>>> config setting >>>>>>>>>>>>>>>>>>>> - For existing ("cold") partitions, the base >>>>>>> auto.offset.reset >>>>>>>>>>>>>> policy >>>>>>>>>>>>>>>>>> continues >>>>>>>>>>>>>>>>>>>> to apply unchanged. >>>>>>>>>>>>>>>>>>>> - The new-partition reset behavior is represented by a >>>>>>>> separate >>>>>>>>>>>>>>>>>> internal config >>>>>>>>>>>>>>>>>>>> (auto.offset.reset.new.partitions, >>>>>>> currently fixed to >>>>>>>>>> earliest).
>>>>>>>>>>>>>> This >>>>>>>>>>>>>>>>>> decoupled design makes >>>>>>>>>>>>>>>>>>>> it straightforward to promote the behavior to a public >>>>>>>>>>> user-facing >>>>>>>>>>>>>>>>>> configuration in a future KIP. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on May 16, 2026, at 7:46 AM: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> hi Jun >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I see what you mean now. My proposal is listed >>>>>>>> below: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> 1) Add auto.offset.reset.new.partitions >>>>>>> with a default value >>>>>>>>>> of >>>>>>>>>>>>>>>>>> earliest. It fixes the data loss from both by_duration and >>>>>>>>>> latest, >>>>>>>>>>> and >>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>> does not change the logic of auto.offset.reset=earliest. >>>>>>>>>>>>>>>>>>>>> 2) Mark auto.offset.reset.new.partitions >>>>>>> as an internal >>>>>>>>>>>>>>>>>> configuration. auto.offset.reset.new.partitions=earliest >>>>>>>> already >>>>>>>>>>>>>>>>>> addresses the issue, and we can discuss the use cases of >>>>>>> other >>>>>>>>>>> values >>>>>>>>>>>>>>>> in a >>>>>>>>>>>>>>>>>> separate KIP.
>>>>>>>>>>>>>>>>>>>>> 3) Both configs, auto.offset.reset.new.partitions >>>>>>> and >>>>>>>>>>>>>>>>>> auto.offset.reset.latest.max.age.ms, >>>>>>> will be applied to all for >>>>>>>>>>>>>>>>>> consistency. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> WDYT? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 2026/05/15 20:53:20 Jun Rao via dev wrote: >>>>>>>>>>>>>>>>>>>>>> Hi, Chia-Ping, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks for the reply. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. In the motivation section, the KIP says "When a Kafka >>>>>>>>>> topic >>>>>>>>>>> is >>>>>>>>>>>>>>>>>> expanded >>>>>>>>>>>>>>>>>>>>>> with new partitions, consumers using the latest auto >>>>>>> offset >>>>>>>>>>> reset >>>>>>>>>>>>>>>>>> policy >>>>>>>>>>>>>>>>>>>>>> will silently miss all records produced to those >>>>>>> partitions >>>>>>>>>>> before >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>> consumer discovers them.". If a user sets >>>>>>>>>>>>>>>>>>>>>> auto.offset.reset=by_duration=1sec, the same record loss >>>>>>>>>> issue >>>>>>>>>>>>>> could >>>>>>>>>>>>>>>>>> also >>>>>>>>>>>>>>>>>>>>>> happen, right? >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2. I was thinking auto.offset.reset.new.partitions >>>>>>> will >>>>>>>> take >>>>>>>>>>> the >>>>>>>>>>>>>>>> same >>>>>>>>>>>>>>>>>>>>>> values as auto.offset.reset. So a user could set it >>>>>>>>>>> to by_duration if >>>>>>>>>>>>>>>>>> needed.
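One possible reading of this suggestion, combined with the fallback from Jun's first message (an unset new-partition config defaults to the policy used for existing partitions), is sketched below. The function and parameter names are hypothetical, and the thread does not settle the final semantics at this point, so treat it as an illustration rather than the KIP's behavior.

```python
from typing import Optional


def effective_reset_policy(base_policy: str,
                           new_partitions_policy: Optional[str],
                           partition_is_new: bool) -> str:
    """Pick the reset policy for a partition that has no committed offset.

    base_policy           -- value of auto.offset.reset
    new_partitions_policy -- value of auto.offset.reset.new.partitions,
                             or None when the config is not set (assumed fallback)
    partition_is_new      -- result of whatever new/existing classifier is used
    """
    if partition_is_new and new_partitions_policy is not None:
        return new_partitions_policy
    return base_policy


# latest for existing partitions, earliest for new ones
print(effective_reset_policy("latest", "earliest", partition_is_new=True))
print(effective_reset_policy("latest", "earliest", partition_is_new=False))
# config not set: new partitions follow the base policy
print(effective_reset_policy("latest", None, partition_is_new=True))
```

Keeping the classifier (`partition_is_new`) separate from the policy lookup mirrors the way the thread debates the two questions independently.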
>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Jun >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai < >>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> hi Jun >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback. I might be missing something >>>>>>>>>>> important >>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>> your >>>>>>>>>>>>>>>>>>>>>>> suggestion, so please bear with me as I try to clarify >>>>>>> with >>>>>>>>>> a >>>>>>>>>>> few >>>>>>>>>>>>>>>>>> questions: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 1. Is there a strong use case for extending this logic >>>>>>> to >>>>>>>>>>> other >>>>>>>>>>>>>>>> reset >>>>>>>>>>>>>>>>>>>>>>> policies? Unlike latest, policies like earliest or >>>>>>>>>> by_duration >>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>> seem >>>>>>>>>>>>>>>>>>>>>>> to suffer from the same silent data loss issue when a >>>>>>>>>>> partition >>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>> expanded. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> 2. What values would we expect users to configure for >>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions? >>>>>>> If they set it to >>>>>>>>>> earliest >>>>>>>>>>> or >>>>>>>>>>>>>>>>>> latest, >>>>>>>>>>>>>>>>>>>>>>> we might run into the exact same edge cases.
For >>>>>>> example, >>>>>>>>>> if a >>>>>>>>>>>>>>>>>> consumer is >>>>>>>>>>>>>>>>>>>>>>> offline for a while and a new partition is created >>>>>>> during >>>>>>>>>> that >>>>>>>>>>>>>>>>>> downtime, >>>>>>>>>>>>>>>>>>>>>>> the user might actually want to skip to latest when >>>>>>>>>> resuming, >>>>>>>>>>>>>>>> rather >>>>>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>>>>>>>>> reading from earliest just because the partition is >>>>>>>>>>> technically >>>>>>>>>>>>>>>>>> "new" to >>>>>>>>>>>>>>>>>>>>>>> the group. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> This is exactly why we opted for introducing a max.age >>>>>>>>>>> threshold. >>>>>>>>>>>>>>>> It >>>>>>>>>>>>>>>>>> gives >>>>>>>>>>>>>>>>>>>>>>> users a time-bound way to define what is genuinely >>>>>>>> "hot/new" >>>>>>>>>>> and >>>>>>>>>>>>>>>>>> what is >>>>>>>>>>>>>>>>>>>>>>> just an old partition they haven't seen yet. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>>>>>> Chia-Ping >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 2026/05/14 20:48:09 Jun Rao via dev wrote: >>>>>>>>>>>>>>>>>>>>>>>> Hi, Jiunn-Yang, >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I find auto.offset.reset.latest.max.age a bit weird. It >>>>>>>>>> only >>>>>>>>>>>>>>>>>> applies when >>>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset is latest. However, it seems that the >>>>>>>>>>>>>> motivation >>>>>>>>>>>>>>>>>>>>>>> equally >>>>>>>>>>>>>>>>>>>>>>>> applies when auto.offset.reset is set to other values >>>>>>> like >>>>>>>>>>>>>>>>>> by_duration. >>>>>>>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>>> intention is that we want to have a separate way to >>>>>>>> control >>>>>>>>>>>>>> newly >>>>>>>>>>>>>>>>>> created >>>>>>>>>>>>>>>>>>>>>>>> partitions vs existing partitions when the group >>>>>>> starts. 
>>>>>>>>>>> Have we >>>>>>>>>>>>>>>>>>>>>>> considered >>>>>>>>>>>>>>>>>>>>>>>> adding a new config like auto.offset.reset.new.partitions? >>>>>>>>>>> If >>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>> config is not set, the offset reset policy defaults to >>>>>>> the >>>>>>>>>>>>>> policy >>>>>>>>>>>>>>>>>> used >>>>>>>>>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>>>>>>> existing partitions. The user could set it explicitly >>>>>>> to >>>>>>>>>>>>>> customize >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> behavior for new partitions. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Jun >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected] >>>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I’d like to manually bump this thread. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on May 1, 2026, at 10:37 PM: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hello all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> DJ01/DJ02: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> MetadataResponse bumps from v13 to v14. The >>>>>>>>>>> PartitionMetadata >>>>>>>>>>>>>>>>>> struct >>>>>>>>>>>>>>>>>>>>>>>>> gains a new >>>>>>>>>>>>>>>>>>>>>>>>>> field PartitionAgeMs (int64, default -1), computed >>>>>>>>>>> server-side >>>>>>>>>>>>>>>> by >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> broker as >>>>>>>>>>>>>>>>>>>>>>>>>> broker_current_time - partition_creation_time.
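The PartitionAgeMs semantics described here (broker time minus creation time, with -1 as the default for an unknown creation time) can be written down as a small sketch. The function name is hypothetical, and clamping the result at zero is an assumption about how a non-negative age might be guaranteed; the thread only states that the value should be greater than or equal to zero when the creation time exists.

```python
from typing import Optional

UNKNOWN_AGE_MS = -1  # default value of the field per the proposal


def partition_age_ms(broker_now_ms: int,
                     partition_created_ms: Optional[int]) -> int:
    """Return the partition age in milliseconds as the broker would compute it.

    Returns -1 when the creation time is unknown. The max(0, ...) clamp is an
    assumption that keeps a known age non-negative even if the stored creation
    timestamp is slightly ahead of the broker clock.
    """
    if partition_created_ms is None:
        return UNKNOWN_AGE_MS
    return max(0, broker_now_ms - partition_created_ms)


print(partition_age_ms(1_000_000, 940_000))    # partition created 60 s ago
print(partition_age_ms(1_000_000, None))       # creation time unknown
print(partition_age_ms(1_000_000, 1_000_050))  # creation stamp ahead of clock
```

Because the subtraction happens on the broker, the consumer only compares the returned age against its threshold and never mixes its own clock into the calculation.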
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I also added the consumer heartbeat flow. When >>>>>>>>>>> MembershipManager >>>>>>>>>>>>>>>>>> detects >>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>> newly assigned >>>>>>>>>>>>>>>>>>>>>>>>>> partition, it explicitly invalidates the metadata for >>>>>>>> the >>>>>>>>>>>>>>>> affected >>>>>>>>>>>>>>>>>>>>>>> topic >>>>>>>>>>>>>>>>>>>>>>>>> and forces a fresh MetadataRequest >>>>>>>>>>>>>>>>>>>>>>>>>> before making the offset reset decision, even if the >>>>>>>>>> topic >>>>>>>>>>> ID >>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>> already >>>>>>>>>>>>>>>>>>>>>>>>> in the cache. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> MB0: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> The consumer learns the broker's maximum supported >>>>>>>>>>>>>>>>>> MetadataResponse >>>>>>>>>>>>>>>>>>>>>>>>> version via the >>>>>>>>>>>>>>>>>>>>>>>>>> ApiVersions negotiation at connection time. If the >>>>>>>>>>> negotiated >>>>>>>>>>>>>>>>>>>>>>> version is >>>>>>>>>>>>>>>>>>>>>>>>> lower than v14, the consumer >>>>>>>>>>>>>>>>>>>>>>>>>> knows the broker does not support PartitionAgeMs at >>>>>>> all >>>>>>>>>> and >>>>>>>>>>>>>> can >>>>>>>>>>>>>>>>>>>>>>> throw an >>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedVersionException >>>>>>>>>>>>>>>>>>>>>>>>>> immediately, rather than silently falling back to >>>>>>> latest >>>>>>>>>>> and >>>>>>>>>>>>>>>>>> risking >>>>>>>>>>>>>>>>>>>>>>>>> data loss without any operator-visible signal. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> MB1/MB2/MB3: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I have addressed these changes in the KIP.
>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 2026年4月29日 >>>>>>>> 下午4:04 >>>>>>>>>>> 寫道: >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> hi David >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with the direction of moving the 'age' >>>>>>>>>> resolution >>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> Heartbeat API to the Metadata API to keep the control >>>>>>>>>> plane >>>>>>>>>>>>>>>> clean. >>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>> main >>>>>>>>>>>>>>>>>>>>>>>>> trade-off, as we noted before, is introducing >>>>>>>> inter-broker >>>>>>>>>>>>>> clock >>>>>>>>>>>>>>>>>> skew. >>>>>>>>>>>>>>>>>>>>>>> The >>>>>>>>>>>>>>>>>>>>>>>>> Group Coordinator approach provided a single source of >>>>>>>>>> truth >>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> time. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> However, realistically, this time skew should be >>>>>>>>>>> negligible. >>>>>>>>>>>>>>>>>> Given >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>> the max.age threshold will likely be configured in >>>>>>>> minutes >>>>>>>>>>> or >>>>>>>>>>>>>>>>>> hours, a >>>>>>>>>>>>>>>>>>>>>>>>> typical NTP skew (in milliseconds) between brokers >>>>>>> won't >>>>>>>>>>> impact >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>> fallback decision. >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> David Jacot via dev <[email protected]> 於 >>>>>>>>>> 2026年4月29日 >>>>>>>>>>>>>>>> 下午3:29 >>>>>>>>>>>>>>>>>> 寫道: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP! 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry, I haven't really followed the previous >>>>>>>>>>> conversation >>>>>>>>>>>>>>>> but I >>>>>>>>>>>>>>>>>>>>>>> took a >>>>>>>>>>>>>>>>>>>>>>>>>>>> quick look at this one. >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ01: I don't clearly understand the flow with the >>>>>>>>>>>>>>>>>>>>>>>>> ConsumerGroupHeartbeat >>>>>>>>>>>>>>>>>>>>>>>>>>>> API after reading the KIP. There is a new boolean; >>>>>>> the >>>>>>>>>>> KIP >>>>>>>>>>>>>>>>>> states >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>> partition ages are returned only when this boolean >>>>>>> is >>>>>>>>>>> set. >>>>>>>>>>>>>>>>>>>>>>> Implicitly, >>>>>>>>>>>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>>>>>>>> means that when the consumer receives a new >>>>>>> partition, >>>>>>>>>> it >>>>>>>>>>>>>> will >>>>>>>>>>>>>>>>>>>>>>> issue a >>>>>>>>>>>>>>>>>>>>>>>>> new >>>>>>>>>>>>>>>>>>>>>>>>>>>> HB request with the boolean set to receive the >>>>>>> ages. >>>>>>>> Is >>>>>>>>>>> my >>>>>>>>>>>>>>>>>>>>>>>>> understanding >>>>>>>>>>>>>>>>>>>>>>>>>>>> correct? We should perhaps clarify the flow and >>>>>>> also >>>>>>>>>>> explain >>>>>>>>>>>>>>>>>> how it >>>>>>>>>>>>>>>>>>>>>>>>> fits >>>>>>>>>>>>>>>>>>>>>>>>>>>> into the existing flow (e.g. list offsets, fetch >>>>>>>>>> offsets, >>>>>>>>>>>>>>>> etc.). >>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ02: If my understanding is correct, I wonder if >>>>>>>>>>>>>>>>>>>>>>>>>>>> the ConsumerGroupHeartbeat API is the right place >>>>>>> for >>>>>>>>>>> this >>>>>>>>>>>>>>>> given >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>> a new >>>>>>>>>>>>>>>>>>>>>>>>>>>> round trip is done anyway. Alternatively, it could >>>>>>>>>> simply >>>>>>>>>>>>>>>>>> include >>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>> metadata.
Generally, we should be rather cautious >>>>>>>> about >>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>>> overloading >>>>>>>>>>>>>>>>>>>>>>>>>>>> the ConsumerGroupHeartbeat API with unrelated >>>>>>>> concepts. >>>>>>>>>>> The >>>>>>>>>>>>>>>> API >>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>> control plane API for assigning or revoking >>>>>>>> partitions. >>>>>>>>>>> The >>>>>>>>>>>>>>>> fact >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>>>>>>>>>>>>>> don't want to add it to the corresponding Streams >>>>>>> API >>>>>>>>>>> also >>>>>>>>>>>>>>>>>> suggests >>>>>>>>>>>>>>>>>>>>>>>>>>>> something is not quite right. What would we do if >>>>>>> we >>>>>>>>>>> want to >>>>>>>>>>>>>>>>>>>>>>> support >>>>>>>>>>>>>>>>>>>>>>>>>>>> Streams in the future? >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani >>>>>>>> via >>>>>>>>>>> dev >>>>>>>>>>>>>> < >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiunn, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for this great kip. Good to know about >>>>>>> the >>>>>>>>>>> gap. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-0 - why a new v2 version bump for >>>>>>>>>>> RequestPartitionAges >>>>>>>>>>>>>>>>>> field. >>>>>>>>>>>>>>>>>>>>>>> Can a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> tagged field (for ex: on response, PartitionAges >>>>>>> on >>>>>>>>>>>>>>>>>>>>>>> TopicPartitions) >>>>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>>>>>> used here and avoid version bump? >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-1 - For the new config, is there a recommended >>>>>>>>>> value >>>>>>>>>>> or >>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>> ConfigDef >>>>>>>>>>>>>>>>>>>>>>>>>>>>> validator? 
Probably it should be based on the >>>>>>>>>>>>>>>> metadata.max.age.ms >>>>>>>>>>>>>>>>>> ? >>>>>>>>>>>>>>>>>>>>>>>>> Sizing >>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions can be part of javadocs I guess. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-2 - (minor) As there are no changes to Kafka >>>>>>>>>> Streams, >>>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>>>>> better >>>>>>>>>>>>>>>>>>>>>>>>>>>>> to add this new config >>>>>>>>>> auto.offset.reset.latest.max.age >>>>>>>>>>> to >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>> StreamsConfig block list >>>>>>>>>>>>>>>>>>>>>>> (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS) >>>>>>>>>>>>>>>>>>>>>>>>> for a >>>>>>>>>>>>>>>>>>>>>>>>>>>>> clear warning, in case users configure it? This is >>>>>>> the >>>>>>>>>>> most >>>>>>>>>>>>>>>>>>>>>>> familiar >>>>>>>>>>>>>>>>>>>>>>>>>>>>> consumer config and users might easily mistakenly >>>>>>>>>>> configure >>>>>>>>>>>>>>>>>> it. Or >>>>>>>>>>>>>>>>>>>>>>>>> maybe >>>>>>>>>>>>>>>>>>>>>>>>>>>>> it's not worth it to add. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-3 - (minor) The phrasing "the consumer falls >>>>>>> back >>>>>>>>>> to >>>>>>>>>>>>>>>>>> earliest" >>>>>>>>>>>>>>>>>>>>>>>>> reads as >>>>>>>>>>>>>>>>>>>>>>>>>>>>> if the config were being changed per-partition >>>>>>> which >>>>>>>>>>> isn't >>>>>>>>>>>>>>>>>>>>>>> supported.
>>>>>>>>>>>>>>>>>>>>>>>>> May >>>>>>>>>>>>>>>>>>>>>>>>>>>>> be rephrasing to something like "consumer resolves >>>>>>>> the >>>>>>>>>>>>>>>> initial >>>>>>>>>>>>>>>>>>>>>>>>> position to >>>>>>>>>>>>>>>>>>>>>>>>>>>>> start offset for that partition" as if earliest >>>>>>> was >>>>>>>>>>> applied >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>>> partition only and auto.offset.reset config is >>>>>>>>>>> unchanged. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Murali >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 < >>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi chia, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have updated the KIP to include this change. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> 於 >>>>>>> 2026年4月28日 >>>>>>>>>>> 晚上8:03 >>>>>>>>>>>>>>>> 寫道: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hi Jiunn-Yang >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> chia_0: Should we expose the partition creation >>>>>>>> time >>>>>>>>>>> via >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>> Admin >>>>>>>>>>>>>>>>>>>>>>>>> API? 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I assume it would be valuable for users to >>>>>>> diagnose >>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>> troubleshoot >>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> behavior of auto.offset.reset.latest.max.age >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on KIP-1327 >>>>>>>>>>> Prevent >>>>>>>>>>>>>> Hot >>>>>>>>>>>>>>>>>> Data >>>>>>>>>>>>>>>>>>>>>>>>> Loss >>>>>>>>>>>>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Partition Expansion for Latest Policy >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/x/KY4mGQ> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This proposal aims to introduce >>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.latest.max.age, >>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consumer config that lets the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> latest reset policy distinguish newly expanded >>>>>>>>>> (hot) >>>>>>>>>>>>>>>>>> partitions >>>>>>>>>>>>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> long-existing (cold) ones.
Partitions >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> younger than the configured threshold >>>>>>>> automatically >>>>>>>>>>> fall >>>>>>>>>>>>>>>>>> back >>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> earliest, preventing silent data loss >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> during topic expansion without forcing a full >>>>>>>>>>> historical >>>>>>>>>>>>>>>>>>>>>>> reprocess. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
