Hi all,

Manually bumping this thread. 

Best Regards,
Jiunn-Yang

> 黃竣陽 <[email protected]> wrote on June 17, 2026 at 9:17 PM:
> 
> Hello chia,
> 
> Thanks for the feedback. I have updated the KIP.
> 
> Best Regards,
> Jiunn-Yang
> 
>> Chia-Ping Tsai <[email protected]> wrote on June 17, 2026 at 12:47 AM:
>> 
>> hi Jiunn-Yang
>> 
>>> When the config is set on a cluster that has not yet been upgraded... 
>>> classification cannot occur... the consumer falls back to the base 
>>> auto.offset.reset for the affected partitions. No exception is thrown, and 
>>> no operational disruption results.
>> 
>> Existing groups can't take advantage of this excellent new configuration.
>> Allowing users to modify the group creation time might be overkill. Instead,
>> we could print a useful warning message to guide users. For example, we can
>> suggest that they re-create the group with their existing committed offsets.
>> 
>>> Protocol changes
>> 
>> Would you mind listing those RPC changes in a table format?
>> 
>>> The full interaction matrix between the base policy and the new-partition 
>>> policy is:
>> 
>> Please add a field to describe the target scenario when using these policies.
>> 
>> Best,
>> Chia-Ping
>> 
>> 
>> On 2026/06/16 16:14:49 黃竣陽 wrote:
>>> Hello Jun, chia,
>>> 
>>> Thanks for the feedback. I have updated the KIP for the new
>>> approach, PTAL.
>>> 
>>> Best Regards,
>>> Jiunn-Yang
>>> 
>>>> Chia-Ping Tsai <[email protected]> wrote on June 16, 2026 at 8:23 AM:
>>>> 
>>>> hi Jun
>>>> 
>>>> Yes, your approach is great. I think the combination of latest (for 
>>>> existing partitions) and by_duration (for new partitions) can address 99% 
>>>> of the complaints I have heard regarding this issue.
>>>> 
>>>> Also, leveraging the group creation time here opens the door to 
>>>> implementing a new policy based on timestamp seek in the future, should 
>>>> the community want to pursue that.
>>>> 
>>>> Thanks for your patience and constructive feedback. We will update the KIP 
>>>> accordingly.
>>>> 
>>>> Best, Chia-Ping
>>>> 
>>>>> Jun Rao via dev <[email protected]> wrote on June 16, 2026 at 5:11 AM:
>>>>> 
>>>>> Hi, Chia-Ping,
>>>>> 
>>>>> Thanks for the reply.
>>>>> 
>>>>> I agree that it's probably useful to allow a user to configure a different
>>>>> offset policy for existing partitions vs new partitions. However, using
>>>>> group creation time to capture that seems more intuitive. Here is another
>>>>> proposal: remove auto.offset.reset.max.age.ms and categorize new partitions
>>>>> based on group creation time. Introduce a new config
>>>>> auto.offset.reset.new.partitions whose values can be earliest, latest and
>>>>> by_duration, the same as auto.offset.reset. Users can set
>>>>> `auto.offset.reset.new.partitions` to `earliest` if they want to guarantee
>>>>> no data loss on new partitions. They can also use by_duration to set an
>>>>> upper bound on the backlog replayed, which can be different from that of
>>>>> the existing partitions. This will address your concern about too much
>>>>> backlog being replayed when the offsets are lost. What do you think?
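
A minimal sketch of the classification proposed above, in Python for brevity. The function name, the millisecond timestamp arguments, and the treatment of an unset config or an unknown creation time (fall back to the base policy) are illustrative assumptions, not the KIP's API:

```python
from typing import Optional


def choose_reset_policy(
    partition_created_ms: Optional[int],
    group_created_ms: Optional[int],
    base_policy: str,
    new_partitions_policy: Optional[str] = None,
) -> str:
    """Pick a reset policy for a partition that has no committed offset.

    Hypothetical helper: a partition counts as "new" when it was created
    after the group. If the new-partition policy is unset, or either
    timestamp is unknown, the base auto.offset.reset policy applies.
    """
    if new_partitions_policy is None:
        return base_policy
    if partition_created_ms is None or group_created_ms is None:
        return base_policy
    if partition_created_ms > group_created_ms:
        return new_partitions_policy
    return base_policy


# latest for existing partitions, earliest for partitions added later.
print(choose_reset_policy(500, 1_000, "latest", "earliest"))    # existing partition
print(choose_reset_policy(2_000, 1_000, "latest", "earliest"))  # new partition
```

With by_duration as the new-partition policy, the same branch would bound the replayed backlog on new partitions independently of the base policy.
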
>>>>> 
>>>>> Jun
>>>>> 
>>>>> 
>>>>>> On Mon, Jun 15, 2026 at 10:39 AM Chia-Ping Tsai <[email protected]> wrote:
>>>>>> 
>>>>>> hi Jun
>>>>>> 
>>>>>> The most important part of this story is how users should expect the data
>>>>>> they can see when using the latest or by_duration policy with expanded
>>>>>> partitions.
>>>>>> 
>>>>>> Yes, the by_duration policy can minimize data loss, but it is
>>>>>> non-deterministic, which means users will either read too many historical
>>>>>> records from existing partitions or lose some records from expanded
>>>>>> partitions.
>>>>>> 
>>>>>> Also, I agree that auto.offset.reset.max.age.ms is a bit hard to
>>>>>> understand, and that is why I preferred having a whole new policy based
>>>>>> entirely on group creation time (KIP-1282).
>>>>>> 
>>>>>> Best,
>>>>>> Chia-Ping
>>>>>> 
>>>>>> Jun Rao via dev <[email protected]> wrote on Tuesday, June 16, 2026 at 1:08 AM:
>>>>>> 
>>>>>>> Hi, Chia-Ping and Jiunn-Yang,
>>>>>>> 
>>>>>>> Thanks for the reply. I am still trying to understand the value of the
>>>>>>> new configs with the KIP.
>>>>>>>
>>>>>>> The motivation of the KIP is that a user doesn't want to miss the data
>>>>>>> if the backlog is small. The backlog of the existing partition is easy to
>>>>>>> understand because it relates to retention time. The backlog for the new
>>>>>>> partition is a bit subtle to understand since it depends on the metadata
>>>>>>> refresh delay. To set auto.offset.reset.max.age.ms, the user needs to
>>>>>>> understand the metadata refresh delay on the consumer side and use it to
>>>>>>> set the config.
>>>>>>> 
>>>>>>> Now, let's consider the alternative: setting the same value for the
>>>>>>> existing by_duration policy. The KIP lists three issues with this
>>>>>>> approach.
>>>>>>> 1. It computes the seek target client-side as now() - duration, which
>>>>>>> introduces clock skew across consumers and forces operators to choose
>>>>>>> overly large durations, causing unnecessary reprocessing.
>>>>>>> 2. The target timestamp is recomputed on each retry, so failed
>>>>>>> ListOffsetsRequest retries can shift the target forward and potentially
>>>>>>> miss records produced between attempts.
>>>>>>> 3. It applies uniformly to all partitions without committed offsets, and
>>>>>>> cannot distinguish newly expanded partitions from long-existing
>>>>>>> partitions newly assigned to the group, leading to unnecessary replay.
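
Issue 2 can be illustrated with a small Python sketch. The attempt times and the 5-second duration are made-up numbers, and the "pinned" variant is only one possible implementation improvement, not something the KIP specifies:

```python
def recomputed_targets(attempt_times_ms, duration_ms):
    """Seek target recomputed as now() - duration on every attempt."""
    return [now - duration_ms for now in attempt_times_ms]


def pinned_targets(attempt_times_ms, duration_ms):
    """Seek target computed once at the first attempt and reused on retries."""
    first_target = attempt_times_ms[0] - duration_ms
    return [first_target] * len(attempt_times_ms)


attempts = [10_000, 12_000, 15_000]  # first try plus two retries (hypothetical)
print(recomputed_targets(attempts, 5_000))  # [5000, 7000, 10000]
print(pinned_targets(attempts, 5_000))      # [5000, 5000, 5000]
```

In this example a record with timestamp 6000 is covered by the first target but skipped once the target has drifted to 10000.
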
>>>>>>> 
>>>>>>> Issues 1 and 2 are uncommon and can be mitigated by adding a bit of
>>>>>>> buffer to the metadata refresh delay. We could also consider improving the
>>>>>>> implementation. For issue 3, the metadata refresh delay is typically low
>>>>>>> (on the order of minutes with the classic consumer and tens of seconds
>>>>>>> with the new consumer). If a user is ok with reading that much backlog for
>>>>>>> new partitions, it seems they will be ok doing the same for existing
>>>>>>> partitions.
>>>>>>> 
>>>>>>> So, instead of introducing a new config, could we just reuse the
>>>>>>> existing config with better documentation and/or implementation?
>>>>>>> 
>>>>>>> Jun
>>>>>>> 
>>>>>>> 
>>>>>>>> On Sat, Jun 13, 2026 at 12:19 AM 黃竣陽 <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hello Jun,
>>>>>>>> 
>>>>>>>> You're right that group creation time is the more intuitive answer at
>>>>>>>> first glance. The KIP's own motivation talks about partitions that
>>>>>>>> "predate the group" vs partitions "created during group runtime," which
>>>>>>>> directly points to a group-lifecycle classifier. I'd like to walk through
>>>>>>>> why we landed on partition age, and the trade-offs we considered.
>>>>>>>> 
>>>>>>>> We evaluated three candidate signals:
>>>>>>>> 
>>>>>>>> 1. `by_duration:5secs`
>>>>>>>> 
>>>>>>>> This covers the metadata blindness window, but has issues the KIP
>>>>>>>> currently documents under "Why not use `by_duration`?":
>>>>>>>>
>>>>>>>> - Client-side `now() - duration` introduces clock skew across consumers.
>>>>>>>> - `ListOffsets` retries shift the target forward, potentially missing
>>>>>>>>   records produced between attempts.
>>>>>>>> - It applies uniformly to all partitions without committed offsets,
>>>>>>>>   including pre-existing partitions newly assigned to the group, causing
>>>>>>>>   unnecessary replay.
>>>>>>>> 
>>>>>>>> 2. Group creation time as classifier
>>>>>>>> 
>>>>>>>> This works cleanly when the consumer is actively running. Our concern
>>>>>>>> is the idle / late-rejoin case:
>>>>>>>> 
>>>>>>>> T=0:         Group created.
>>>>>>>> T=1..T=100:  Consumer idle (down, disconnected, etc.).
>>>>>>>> T=50:        Partition added during the idle window.
>>>>>>>> T=100:       Consumer resumes.
>>>>>>>> 
>>>>>>>> Under group creation time, the new partition is classified as new
>>>>>>>> (`50 > 0`) and reset to `earliest`, replaying everything from T=50.
>>>>>>>> But during `[T=1, T=100]`, base partitions also accumulated data that
>>>>>>>> the consumer accepts as lost — that is precisely the contract of
>>>>>>>> `auto.offset.reset=latest`. There is no principled reason to treat
>>>>>>>> the new partition differently; both contain backlog accumulated during
>>>>>>>> the same idle window.
>>>>>>>> 
>>>>>>>> This aligns with the "backlog is backlog" principle you raised in
>>>>>>>> the KIP-1282 thread: a `latest` user has tolerated some backlog on
>>>>>>>> every other partition during the same idle period; forcing 0-backlog
>>>>>>>> tolerance only on new partitions would be inconsistent with that
>>>>>>>> tolerance.
>>>>>>>> 
>>>>>>>> 3. Partition age vs threshold
>>>>>>>> 
>>>>>>>> Partition age corresponds to the actual silent data loss window,
>>>>>>>> the gap between partition creation and the consumer’s metadata
>>>>>>>> refresh. Within this window, data loss is genuinely silent: the
>>>>>>>> consumer had no opportunity to know about the partition. Outside this
>>>>>>>> window, missing data reflects either:
>>>>>>>> 
>>>>>>>> - (a) the user’s tolerated cost of running with idle consumers, or
>>>>>>>> - (b) an operational issue to surface via monitoring, not via reset
>>>>>>>>   policy.
>>>>>>>> 
>>>>>>>> We did not choose partition age because it is more elegant than group
>>>>>>>> creation time; we chose it because its failure mode (requires a
>>>>>>>> threshold) is less invasive than the failure mode of group creation time
>>>>>>>> (overrides user-stated `latest` intent during idle periods).
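
The idle / late-rejoin timeline from candidate 2 can be replayed against both classifiers. This is a Python sketch; the function names, the abstract time units, and the max-age threshold of 10 are assumptions chosen for illustration:

```python
def is_new_by_group_creation(partition_created, group_created):
    """Classify as new when the partition was created after the group."""
    return partition_created > group_created


def is_new_by_partition_age(partition_created, now, max_age):
    """Classify as new when the partition is at most max_age old."""
    return (now - partition_created) <= max_age


# Timeline from the example: group at T=0, partition at T=50, resume at T=100.
print(is_new_by_group_creation(50, 0))       # True
print(is_new_by_partition_age(50, 100, 10))  # False
```

Under group creation time the partition gets the new-partition policy even though the consumer was idle; under partition age with this threshold it falls back to the base policy.
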
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Jiunn-Yang
>>>>>>>> 
>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on June 13, 2026 at 11:52 AM:
>>>>>>>>> 
>>>>>>>>> Hi Jun,
>>>>>>>>> 
>>>>>>>>> Relying on both creation times will create an inconsistent scenario. A
>>>>>>>>> consumer that lost all offsets due to a long sleep will seek to the
>>>>>>>>> beginning for the partitions created later than the group.
>>>>>>>>> 
>>>>>>>>> That is why we initially proposed KIP-1282 to fix the inconsistency
>>>>>>>>> using a whole new policy. Since KIP-1282 couldn't reach a consensus,
>>>>>>>>> KIP-1327 goes back to using flexible configurations to prevent users from
>>>>>>>>> falling into that pitfall.
>>>>>>>>> 
>>>>>>>>> Best, Chia-Ping
>>>>>>>>> 
>>>>>>>>> Jun Rao via dev <[email protected]> wrote on Saturday, June 13, 2026 at 6:49 AM:
>>>>>>>>> 
>>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>>> 
>>>>>>>>>> Thanks for the reply, and sorry for the late response.
>>>>>>>>>>
>>>>>>>>>> JR1. The design of auto.offset.reset.max.age.ms still feels weird to me.
>>>>>>>>>> It categorizes partitions as new or existing based on the partition
>>>>>>>>>> creation time. Intuitively, the categorization should be based on the
>>>>>>>>>> group creation time: all partitions existing when the group is created
>>>>>>>>>> are existing and all partitions created after the group creation are new
>>>>>>>>>> partitions.
>>>>>>>>>> 
>>>>>>>>>> Jun
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Jun 9, 2026 at 8:51 AM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi all,
>>>>>>>>>>> 
>>>>>>>>>>> Manually bumping this thread. If there is no further
>>>>>>>>>>> discussion, I will close the vote.
>>>>>>>>>>> 
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>> 
>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on June 1, 2026 at 7:16 PM:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hello Jian,
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> Agreed, partition expansion is a common operational task, not an edge
>>>>>>>>>>>> case. I've updated the Motivation section accordingly.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>> 
>>>>>>>>>>>>> jian fu <[email protected]> wrote on June 1, 2026 at 5:49 PM:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Jiunn-Yang:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the KIP. I think it would be useful to clarify that this is a
>>>>>>>>>>>>> common scenario rather than an edge case, which further demonstrates the
>>>>>>>>>>>>> need for this optimization. For example:
>>>>>>>>>>>>> Partition expansion is a common operational task in Kafka. To balance
>>>>>>>>>>>>> resource utilization and cost, topics are typically created with a
>>>>>>>>>>>>> moderate default partition count. However, as traffic grows over time, it
>>>>>>>>>>>>> is often necessary to increase the number of partitions to accommodate the
>>>>>>>>>>>>> higher workload.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Jian
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on Saturday, May 30, 2026 at 22:31:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hello chia,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the comments, I have updated the KIP!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on May 30, 2026 at 8:29 PM:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Jiunn-Yang,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Would you mind removing the terms "hot" and "cold" when describing
>>>>>>>>>>>>>>> partitions in the KIP? I understand you are using them to describe the
>>>>>>>>>>>>>>> "freshness" or the users' need for the records, but applying these terms
>>>>>>>>>>>>>>> to the partition itself feels a bit unnatural.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After all, in this scenario, users don't really care whether a partition
>>>>>>>>>>>>>>> is newly expanded or not. Their only expectation is that they won't
>>>>>>>>>>>>>>> silently lose any live records produced to the topic during their active
>>>>>>>>>>>>>>> consumption.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best, Chia-Ping
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on Saturday, May 30, 2026 at 12:30 PM:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hello Jun,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks for the feedback. I have updated the KIP motivation section.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jun Rao via dev <[email protected]> wrote on May 30, 2026 at 1:12 AM:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks for the reply. I think we need a stronger motivation for the KIP.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The KIP says "The core insight is that not all partitions without a
>>>>>>>>>>>>>>>>> committed offset are the same. A newly expanded partition (hot) is
>>>>>>>>>>>>>>>>> fundamentally different from a partition the consumer has never seen
>>>>>>>>>>>>>>>>> because it predates the group (cold)." Why is the hot partition
>>>>>>>>>>>>>>>>> fundamentally different from the cold?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The KIP says "The existing by_duration policy is also insufficient
>>>>>>>>>>>>>>>>> because:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - The calculated seek time (now() - duration) varies across nodes due to
>>>>>>>>>>>>>>>>> clock skew. To be safe, users must set an overly large duration, causing
>>>>>>>>>>>>>>>>> unnecessary reprocessing.
>>>>>>>>>>>>>>>>> - On network errors, the client recalculates the seek time on retry,
>>>>>>>>>>>>>>>>> shifting the target timestamp forward and risking data loss."
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, both of these situations are rare. If these issues persist, more
>>>>>>>>>>>>>>>>> severe problems likely exist elsewhere. Rare situations don't need a
>>>>>>>>>>>>>>>>> common solution. If users care about those rare situations, they can
>>>>>>>>>>>>>>>>> implement customized logic using
>>>>>>>>>>>>>>>>> ConsumerRebalanceListener.onPartitionsAssigned().
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jun
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Sun, May 17, 2026 at 6:50 AM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hello chia,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for the feedback.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should always be
>>>>>>>>>>>>>>>>>>> greater than or equal to zero, right?
>>>>>>>>>>>>>>>>>> I have explicitly mentioned this in the KIP.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> New  Old (MetadataResponse v0–13)    positive    any    field absent    UnsupportedVersionException
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The earliest point at which we can detect the version mismatch is during
>>>>>>>>>>>>>>>>>> the first metadata fetch after assignment, which occurs inside poll().
>>>>>>>>>>>>>>>>>> Therefore, the user would encounter an UnsupportedVersionException from
>>>>>>>>>>>>>>>>>> poll(). I’ll clarify this in the KIP.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on May 17, 2026 at 4:50 PM:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> hi Jiunn
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> PartitionAgeMs (int64, default -1): The age of this partition in
>>>>>>>>>>>>>>>>>>>> milliseconds, computed server-side by the broker as
>>>>>>>>>>>>>>>>>>>> broker_current_time - partition_creation_time. Returns -1 if the broker
>>>>>>>>>>>>>>>>>>>> does not support this feature or the partition creation time is unknown.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If the creation time exists, the returned value should always be greater
>>>>>>>>>>>>>>>>>>> than or equal to zero, right?
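
The quoted field definition amounts to roughly the following Python sketch. The function name and the clamp to zero for a creation time slightly ahead of the broker clock are assumptions, not the KIP's stated behavior:

```python
from typing import Optional

UNKNOWN_AGE_MS = -1  # sentinel from the quoted field definition


def partition_age_ms(broker_now_ms: int, created_ms: Optional[int]) -> int:
    """Return broker_current_time - partition_creation_time, or -1 if unknown."""
    if created_ms is None:
        return UNKNOWN_AGE_MS
    # Assumption: clamp so a known creation time never yields a negative age.
    return max(0, broker_now_ms - created_ms)


print(partition_age_ms(60_000, 45_000))  # 15000
print(partition_age_ms(60_000, None))    # -1
```
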
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> New  Old (MetadataResponse v0–13)    positive    any    field absent    UnsupportedVersionException
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Will users encounter UnsupportedVersionException when calling `poll()`?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On 2026/05/16 04:30:49 黃竣陽 wrote:
>>>>>>>>>>>>>>>>>>>> Hello Jun, chia,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> I've updated KIP-1327 with a design change based on the discussion
>>>>>>>>>>>>>>>>>>>> feedback.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The updated design decouples the new-partition reset behavior from
>>>>>>>>>>>>>>>>>>>> the base auto.offset.reset policy:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - auto.offset.reset.max.age.ms now applies to all auto.offset.reset
>>>>>>>>>>>>>>>>>>>> values (latest, earliest, by_duration, none).
>>>>>>>>>>>>>>>>>>>> - For new ("hot") partitions, the consumer resets according to the
>>>>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions config setting.
>>>>>>>>>>>>>>>>>>>> - For existing ("cold") partitions, the base auto.offset.reset policy
>>>>>>>>>>>>>>>>>>>> continues to apply unchanged.
>>>>>>>>>>>>>>>>>>>> - The new-partition reset behavior is represented by a separate
>>>>>>>>>>>>>>>>>>>> internal config (auto.offset.reset.new.partitions, currently fixed to
>>>>>>>>>>>>>>>>>>>> earliest). This decoupled design makes it straightforward to promote the
>>>>>>>>>>>>>>>>>>>> behavior to a public user-facing configuration in a future KIP.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on May 16, 2026 at 7:46 AM:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> hi Jun
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I see what you mean now. My proposal is listed below:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 1) Add auto.offset.reset.new.partitions with a default value of
>>>>>>>>>>>>>>>>>>>>> earliest. It fixes the data loss from both by_duration and latest, and it
>>>>>>>>>>>>>>>>>>>>> does not change the logic of auto.offset.reset=earliest.
>>>>>>>>>>>>>>>>>>>>> 2) Mark auto.offset.reset.new.partitions as an internal configuration.
>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions=earliest already addresses the issue, and
>>>>>>>>>>>>>>>>>>>>> we can discuss the use cases of other values in a separate KIP.
>>>>>>>>>>>>>>>>>>>>> 3) Both configs, auto.offset.reset.new.partitions and
>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.latest.max.age.ms, will be applied to all
>>>>>>>>>>>>>>>>>>>>> auto.offset.reset values for consistency.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On 2026/05/15 20:53:20 Jun Rao via dev wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi, Chia-Ping,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Thanks for the reply.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 1. In the motivation section, the KIP says "When a Kafka topic is
>>>>>>>>>>>>>>>>>>>>>> expanded with new partitions, consumers using the latest auto offset
>>>>>>>>>>>>>>>>>>>>>> reset policy will silently miss all records produced to those partitions
>>>>>>>>>>>>>>>>>>>>>> before the consumer discovers them.". If a user sets
>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset=by_duration=1sec, the same record loss issue could also
>>>>>>>>>>>>>>>>>>>>>> happen, right?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2. I was thinking auto.offset.reset.new.partitions will take the same
>>>>>>>>>>>>>>>>>>>>>> values as auto.offset.reset. So a user could set it to by_duration if
>>>>>>>>>>>>>>>>>>>>>> needed.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Jun
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> hi Jun
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback. I might be missing something important from your
>>>>>>>>>>>>>>>>>>>>>>> suggestion, so please bear with me as I try to clarify with a few
>>>>>>>>>>>>>>>>>>>>>>> questions:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 1. Is there a strong use case for extending this logic to other reset
>>>>>>>>>>>>>>>>>>>>>>> policies? Unlike latest, policies like earliest or by_duration don't seem
>>>>>>>>>>>>>>>>>>>>>>> to suffer from the same silent data loss issue when a partition is
>>>>>>>>>>>>>>>>>>>>>>> expanded.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 2. What values would we expect users to configure for
>>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset.new.partitions? If they set it to earliest or latest,
>>>>>>>>>>>>>>>>>>>>>>> we might run into the exact same edge cases. For example, if a consumer is
>>>>>>>>>>>>>>>>>>>>>>> offline for a while and a new partition is created during that downtime,
>>>>>>>>>>>>>>>>>>>>>>> the user might actually want to skip to latest when resuming, rather than
>>>>>>>>>>>>>>>>>>>>>>> reading from earliest just because the partition is technically "new" to
>>>>>>>>>>>>>>>>>>>>>>> the group.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> This is exactly why we opted for introducing a max.age threshold. It gives
>>>>>>>>>>>>>>>>>>>>>>> users a time-bound way to define what is genuinely "hot/new" and what is
>>>>>>>>>>>>>>>>>>>>>>> just an old partition they haven't seen yet.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> On 2026/05/14 20:48:09 Jun Rao via dev wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> I find auto.offset.reset.latest.max.age a bit weird. It only applies when
>>>>>>>>>>>>>>>>>>>>>>>> auto.offset.reset is latest. However, it seems that the motivation equally
>>>>>>>>>>>>>>>>>>>>>>>> applies when auto.offset.reset is set to other values like by_duration.
>>>>>>>>>>>>>>>>>>>>>>>> The intention is that we want to have a separate way to control newly
>>>>>>>>>>>>>>>>>>>>>>>> created partitions vs existing partitions when the group starts. Have we
>>>>>>>>>>>>>>>>>>>>>>>> considered adding a new config like auto.offset.reset.new.partitions? If
>>>>>>>>>>>>>>>>>>>>>>>> this new config is not set, the offset reset policy defaults to the policy
>>>>>>>>>>>>>>>>>>>>>>>> used for existing partitions. The user could set it explicitly to
>>>>>>>>>>>>>>>>>>>>>>>> customize the behavior for new partitions.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Jun
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> I’d like to manually bump this thread.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 黃竣陽 <[email protected]> wrote on May 1, 2026 at 10:37 PM:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> DJ01/DJ02:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> MetadataResponse bumps from v13 to v14. The PartitionMetadata struct
>>>>>>>>>>>>>>>>>>>>>>>>>> gains a new field PartitionAgeMs (int64, default -1), computed
>>>>>>>>>>>>>>>>>>>>>>>>>> server-side by the broker as
>>>>>>>>>>>>>>>>>>>>>>>>>> broker_current_time - partition_creation_time.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I also added the consumer heartbeat flow. When MembershipManager detects
>>>>>>>>>>>>>>>>>>>>>>>>>> a newly assigned partition, it explicitly invalidates the metadata for the
>>>>>>>>>>>>>>>>>>>>>>>>>> affected topic and forces a fresh MetadataRequest before making the offset
>>>>>>>>>>>>>>>>>>>>>>>>>> reset decision, even if the topic ID is already in the cache.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> MB0:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> The consumer learns the broker's maximum supported MetadataResponse
>>>>>>>>>>>>>>>>>>>>>>>>>> version via the ApiVersions negotiation at connection time. If the
>>>>>>>>>>>>>>>>>>>>>>>>>> negotiated version is lower than v14, the consumer knows the broker does
>>>>>>>>>>>>>>>>>>>>>>>>>> not support PartitionAgeMs at all and can throw an
>>>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedVersionException immediately, rather than silently falling back
>>>>>>>>>>>>>>>>>>>>>>>>>> to latest and risking data loss without any operator-visible signal.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> MB1/MB2/MB3:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> I have addressed these changes in the KIP.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on April 29, 2026 at 4:04 PM:
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> hi David
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with the direction of moving the 'age' resolution from the
>>>>>>>>>>>>>>>>>>>>>>>>>>> Heartbeat API to the Metadata API to keep the control plane clean. The
>>>>>>>>>>>>>>>>>>>>>>>>>>> main trade-off, as we noted before, is introducing inter-broker clock
>>>>>>>>>>>>>>>>>>>>>>>>>>> skew. The Group Coordinator approach provided a single source of truth
>>>>>>>>>>>>>>>>>>>>>>>>>>> for time.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> However, realistically, this time skew should be negligible. Given that
>>>>>>>>>>>>>>>>>>>>>>>>>>> the max.age threshold will likely be configured in minutes or hours, a
>>>>>>>>>>>>>>>>>>>>>>>>>>> typical NTP skew (in milliseconds) between brokers won't impact the
>>>>>>>>>>>>>>>>>>>>>>>>>>> fallback decision.
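The skew argument can be checked with a little arithmetic. The Java sketch below is only an illustration under stated assumptions: the 10-minute threshold and the 50 ms skew are made-up values, the class and method names are hypothetical, and "younger than the threshold" is assumed to be a strict less-than comparison.

```java
class ClockSkewImpact {
    // Assumption: a partition counts as "new" when its age is strictly
    // below the configured max.age threshold.
    static boolean isNew(long partitionAgeMs, long maxAgeMs) {
        return partitionAgeMs < maxAgeMs;
    }

    // True when an age reading that is off by skewMs leads to a different
    // decision than the true age would.
    static boolean decisionFlips(long trueAgeMs, long skewMs, long maxAgeMs) {
        return isNew(trueAgeMs, maxAgeMs) != isNew(trueAgeMs + skewMs, maxAgeMs);
    }

    public static void main(String[] args) {
        long maxAgeMs = 10 * 60 * 1000L; // hypothetical 10-minute threshold
        long skewMs = 50L;               // hypothetical inter-broker NTP skew

        // Partition that is 1 minute old: far from the threshold.
        System.out.println(decisionFlips(60_000L, skewMs, maxAgeMs));
        // Partition that is 20 ms short of the threshold: inside the skew window.
        System.out.println(decisionFlips(599_980L, skewMs, maxAgeMs));
    }
}
```

With these numbers, only a partition whose true age lies within 50 ms of the 600,000 ms threshold can be classified differently, which is the sense in which millisecond skew is negligible against a threshold of minutes or hours.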
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> David Jacot via dev <[email protected]> wrote on April 29, 2026 at 3:29 PM:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP!
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sorry, I haven't really followed the previous conversation but I took a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> quick look at this one.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ01: I don't clearly understand the flow with the ConsumerGroupHeartbeat
>>>>>>>>>>>>>>>>>>>>>>>>>>>> API after reading the KIP. There is a new boolean; the KIP states that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> partition ages are returned only when this boolean is set. Implicitly,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> this means that when the consumer receives a new partition, it will
>>>>>>>>>>>>>>>>>>>>>>>>>>>> issue a new HB request with the boolean set to receive the ages. Is my
>>>>>>>>>>>>>>>>>>>>>>>>>>>> understanding correct? We should perhaps clarify the flow and also
>>>>>>>>>>>>>>>>>>>>>>>>>>>> explain how it fits into the existing flow (e.g. list offsets, fetch
>>>>>>>>>>>>>>>>>>>>>>>>>>>> offsets, etc.).
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> DJ02: If my understanding is correct, I wonder if the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ConsumerGroupHeartbeat API is the right place for this given that a new
>>>>>>>>>>>>>>>>>>>>>>>>>>>> round trip is done anyway. Alternatively, it could simply include the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> metadata. Generally, we should be careful not to overload the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ConsumerGroupHeartbeat API with unrelated concepts. The API is a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> control plane API for assigning or revoking partitions. The fact that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> we don't want to add it to the corresponding Streams API also suggests
>>>>>>>>>>>>>>>>>>>>>>>>>>>> something is not quite right. What would we do if we want to support
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Streams in the future?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiunn,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for this great KIP. Good to know about the gap.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-0 - Why is a new v2 version bump needed for the RequestPartitionAges
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> field? Could a tagged field (for example, PartitionAges on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> TopicPartitions in the response) be used here to avoid the version bump?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-1 - For the new config, is there a recommended value or a ConfigDef
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> validator? Probably it should be based on metadata.max.age.ms? Sizing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> instructions can be part of the Javadoc, I guess.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> better to add this new config auto.offset.reset.latest.max.age to the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> StreamsConfig block list (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for a clear warning, in case users configure it? This is the most
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> familiar consumer config and users might easily configure it by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mistake. Or maybe it's not worth adding.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest" reads
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as if the config were being changed per-partition, which isn't
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> supported. Maybe rephrase it to something like "consumer resolves the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> initial position to start offset for that partition", as if earliest
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> was applied to that partition only and the auto.offset.reset config is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unchanged.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Murali
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi chia,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have updated the KIP to include this change.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping Tsai <[email protected]> wrote on April 28, 2026 at 8:03 PM:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hi Jiunn-Yang
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> chia_0: Should we expose the partition creation time via the Admin API?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I assume it would be valuable for users to diagnose and troubleshoot
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the behavior of auto.offset.reset.latest.max.age.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on KIP-1327: Prevent Hot Data Loss
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on Partition Expansion for Latest Policy
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/x/KY4mGQ>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This proposal introduces auto.offset.reset.latest.max.age, a consumer
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> config that lets the latest reset policy distinguish newly expanded
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (hot) partitions from long-existing (cold) ones. Partitions younger
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> than the configured threshold automatically fall back to earliest,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> preventing silent data loss during topic expansion without forcing a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> full historical reprocess.
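The per-partition rule in the proposal can be sketched in a few lines of Java. This is a hedged illustration, not the KIP's implementation: the class, enum, and method names are hypothetical, ages and the threshold are assumed to be in milliseconds, and "younger than" is assumed to mean strictly less than the threshold.

```java
class LatestMaxAgeReset {
    // Where the consumer's initial position is resolved for one partition.
    enum Reset { EARLIEST, LATEST }

    // Assumption: a partition younger than the configured max age is treated
    // as newly expanded ("hot") and resolved to its start offset; any older
    // partition keeps the normal latest behavior. The auto.offset.reset
    // config itself is not changed by this decision.
    static Reset resolve(long partitionAgeMs, long maxAgeMs) {
        return partitionAgeMs < maxAgeMs ? Reset.EARLIEST : Reset.LATEST;
    }

    public static void main(String[] args) {
        long maxAgeMs = 60_000L; // hypothetical threshold of one minute
        System.out.println(resolve(5_000L, maxAgeMs));      // recently added partition
        System.out.println(resolve(86_400_000L, maxAgeMs)); // partition created a day ago
    }
}
```

Because the decision is made per partition, only the expanded partitions are read from the beginning; long-existing partitions are not reprocessed.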
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiunn-Yang