Hello chia, 

Thanks for the comments, I have updated the KIP!

Best Regards,
Jiunn-Yang

> On May 30, 2026 at 8:29 PM Chia-Ping Tsai <[email protected]> wrote:
> 
> Hi Jiunn-Yang,
> 
> Would you mind removing the terms "hot" and "cold" when describing
> partitions in the KIP? I understand you are using them to describe the
> "freshness" or the users' need for the records, but applying these terms to
> the partition itself feels a bit unnatural.
> 
> After all, in this scenario, users don't really care whether a partition is
> newly expanded or not. Their only expectation is that they won't silently
> lose any live records produced to the topic during their active consumption.
> 
> Best, Chia-Ping
> 
> 
> 
> On Sat, May 30, 2026 at 12:30 PM 黃竣陽 <[email protected]> wrote:
> 
>> Hello Jun,
>> 
>> Thanks for the feedback, I have updated the KIP motivation section.
>> 
>> Best Regards,
>> Jiunn-Yang
>> 
>>> On May 30, 2026 at 1:12 AM Jun Rao via dev <[email protected]> wrote:
>>> 
>>> Hi, Jiunn-Yang,
>>> 
>>> Thanks for the reply. I think we need a stronger motivation for the KIP.
>>> 
>>> The KIP says "The core insight is that not all partitions without a
>>> committed offset are the same. A newly expanded partition (hot) is
>>> fundamentally different from a partition the consumer has never seen
>>> because it predates the group (cold)." Why is the hot partition
>>> fundamentally different from the cold?
>>> 
>>> The KIP says "The existing by_duration policy is also insufficient because:
>>> 
>>>  - The calculated seek time (now() - duration) varies across nodes due to
>>>  clock skew. To be safe, users must set an overly large duration, causing
>>>  unnecessary reprocessing.
>>>  - On network errors, the client recalculates the seek time on retry,
>>>  shifting the target timestamp forward and risking data loss."
>>> 
>>> However, both of these situations are rare. If these issues persist, more
>>> severe problems likely exist elsewhere. Rare situations don't need a common
>>> solution. If users care about those rare situations, they can implement
>>> customized logic using ConsumerRebalanceListener.onPartitionsAssigned().
>>> 
>>> Jun
>>> 
>>> 
>>> On Sun, May 17, 2026 at 6:50 AM 黃竣陽 <[email protected]> wrote:
>>> 
>>>> Hello chia,
>>>> 
>>>> Thanks for the feedback.
>>>> 
>>>>> If the creation time exists, the returned value should always be greater
>>>>> than or equal to zero, right?
>>>> I have explicitly mentioned this in the KIP.
>>>> 
>>>>>> New  Old (MetadataResponse v0–13)    positive        any     field absent    UnsupportedVersionException
>>>> 
>>>> The earliest point at which we can detect the version mismatch is during the
>>>> first metadata fetch after assignment, which occurs inside poll(). Therefore,
>>>> the user would encounter an UnsupportedVersionException from poll(). I’ll
>>>> clarify this in the KIP.
>>>> 
>>>> Best Regards,
>>>> Jiunn-Yang
>>>> 
>>>>> On May 17, 2026 at 4:50 PM Chia-Ping Tsai <[email protected]> wrote:
>>>>> 
>>>>> hi Jiunn
>>>>> 
>>>>>> PartitionAgeMs (int64, default -1): The age of this partition in
>>>>>> milliseconds, computed server-side by the broker as broker_current_time -
>>>>>> partition_creation_time. Returns -1 if the broker does not support this
>>>>>> feature or the partition creation time is unknown.
>>>>> 
>>>>> If the creation time exists, the returned value should always be greater
>>>>> than or equal to zero, right?
>>>>> 
>>>>>> New  Old (MetadataResponse v0–13)    positive        any     field absent    UnsupportedVersionException
>>>>> 
>>>>> Will users encounter an UnsupportedVersionException when calling `poll()`?
>>>>> 
>>>>> Best,
>>>>> Chia-Ping
>>>>> 
>>>>> 
>>>>> On 2026/05/16 04:30:49 黃竣陽 wrote:
>>>>>> Hello Jun, chia,
>>>>>> 
>>>>>> I've updated KIP-1327 with a design change based on the discussion
>>>>>> feedback.
>>>>>> 
>>>>>> The updated design decouples the new-partition reset behavior from
>>>>>> the base auto.offset.reset policy:
>>>>>> 
>>>>>> - auto.offset.reset.max.age.ms now applies to all auto.offset.reset values
>>>>>> (latest, earliest, by_duration, none).
>>>>>> - For new ("hot") partitions, the consumer resets according to the
>>>>>> auto.offset.reset.new.partitions config setting.
>>>>>> - For existing ("cold") partitions, the base auto.offset.reset policy
>>>>>> continues to apply unchanged.
>>>>>> - The new-partition reset behavior is represented by a separate internal
>>>>>> config (auto.offset.reset.new.partitions, currently fixed to earliest).
>>>>>> This decoupled design makes it straightforward to promote the behavior to
>>>>>> a public user-facing configuration in a future KIP.
>>>>>> 
>>>>>> Best Regards,
>>>>>> Jiunn-Yang
>>>>>> 
>>>>>> 
>>>>>>> On May 16, 2026 at 7:46 AM Chia-Ping Tsai <[email protected]> wrote:
>>>>>>> 
>>>>>>> hi Jun
>>>>>>> 
>>>>>>> I see what you mean now. My proposal is listed below:
>>>>>>> 
>>>>>>> 1) Add auto.offset.reset.new.partitions with a default value of earliest.
>>>>>>> It fixes the data loss from both by_duration and latest, and it does not
>>>>>>> change the logic of auto.offset.reset=earliest.
>>>>>>> 2) Mark auto.offset.reset.new.partitions as an internal configuration.
>>>>>>> auto.offset.reset.new.partitions=earliest already addresses the issue,
>>>>>>> and we can discuss the use cases of other values in a separate KIP.
>>>>>>> 3) Both configs, auto.offset.reset.new.partitions and
>>>>>>> auto.offset.reset.latest.max.age.ms, will be applied to all
>>>>>>> auto.offset.reset values for consistency.
>>>>>>> 
>>>>>>> WDYT?
>>>>>>> 
>>>>>>> On 2026/05/15 20:53:20 Jun Rao via dev wrote:
>>>>>>>> Hi, Chia-Ping,
>>>>>>>> 
>>>>>>>> Thanks for the reply.
>>>>>>>> 
>>>>>>>> 1. In the motivation section, the KIP says "When a Kafka topic is expanded
>>>>>>>> with new partitions, consumers using the latest auto offset reset policy
>>>>>>>> will silently miss all records produced to those partitions before the
>>>>>>>> consumer discovers them." If a user sets
>>>>>>>> auto.offset.reset=by_duration=1sec, the same record loss issue could also
>>>>>>>> happen, right?
>>>>>>>> 
>>>>>>>> 2. I was thinking auto.offset.reset.new.partitions will take the same
>>>>>>>> values as auto.offset.reset. So a user could set it to by_duration if
>>>>>>>> needed.
>>>>>>>> 
>>>>>>>> Jun
>>>>>>>> 
>>>>>>>> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> hi Jun
>>>>>>>>> 
>>>>>>>>> Thanks for the feedback. I might be missing something important from your
>>>>>>>>> suggestion, so please bear with me as I try to clarify with a few
>>>>>>>>> questions:
>>>>>>>>> 
>>>>>>>>> 1. Is there a strong use case for extending this logic to other reset
>>>>>>>>> policies? Unlike latest, policies like earliest or by_duration don't seem
>>>>>>>>> to suffer from the same silent data loss issue when a partition is
>>>>>>>>> expanded.
>>>>>>>>> 
>>>>>>>>> 2. What values would we expect users to configure for
>>>>>>>>> auto.offset.reset.new.partitions? If they set it to earliest or latest,
>>>>>>>>> we might run into the exact same edge cases. For example, if a consumer is
>>>>>>>>> offline for a while and a new partition is created during that downtime,
>>>>>>>>> the user might actually want to skip to latest when resuming, rather than
>>>>>>>>> reading from earliest just because the partition is technically "new" to
>>>>>>>>> the group.
>>>>>>>>> 
>>>>>>>>> This is exactly why we opted for introducing a max.age threshold. It gives
>>>>>>>>> users a time-bound way to define what is genuinely "hot/new" and what is
>>>>>>>>> just an old partition they haven't seen yet.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Chia-Ping
>>>>>>>>> 
>>>>>>>>> On 2026/05/14 20:48:09 Jun Rao via dev wrote:
>>>>>>>>>> Hi, Jiunn-Yang,
>>>>>>>>>> 
>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>> 
>>>>>>>>>> I find auto.offset.reset.latest.max.age a bit weird. It only applies when
>>>>>>>>>> auto.offset.reset is latest. However, it seems that the motivation equally
>>>>>>>>>> applies when auto.offset.reset is set to other values like by_duration.
>>>>>>>>>> The intention is that we want to have a separate way to control newly
>>>>>>>>>> created partitions vs existing partitions when the group starts. Have we
>>>>>>>>>> considered adding a new config like auto.offset.reset.new.partitions? If
>>>>>>>>>> this new config is not set, the offset reset policy defaults to the policy
>>>>>>>>>> used for existing partitions. The user could set it explicitly to
>>>>>>>>>> customize the behavior for new partitions.
>>>>>>>>>> 
>>>>>>>>>> Jun
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi all,
>>>>>>>>>>> 
>>>>>>>>>>> I’d like to manually bump this thread.
>>>>>>>>>>> 
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>> 
>>>>>>>>>>>> On May 1, 2026 at 10:37 PM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks for the feedback.
>>>>>>>>>>>> 
>>>>>>>>>>>> DJ01/DJ02:
>>>>>>>>>>>> 
>>>>>>>>>>>> MetadataResponse bumps from v13 to v14. The PartitionMetadata struct
>>>>>>>>>>>> gains a new field PartitionAgeMs (int64, default -1), computed
>>>>>>>>>>>> server-side by the broker as
>>>>>>>>>>>> broker_current_time - partition_creation_time.
>>>>>>>>>>>> 
>>>>>>>>>>>> I have also added the consumer heartbeat flow. When MembershipManager
>>>>>>>>>>>> detects a newly assigned partition, it explicitly invalidates the
>>>>>>>>>>>> metadata for the affected topic and forces a fresh MetadataRequest
>>>>>>>>>>>> before making the offset reset decision, even if the topic ID is
>>>>>>>>>>>> already in the cache.
>>>>>>>>>>>> 
>>>>>>>>>>>> MB0:
>>>>>>>>>>>> 
>>>>>>>>>>>> The consumer learns the broker's maximum supported MetadataResponse
>>>>>>>>>>>> version via the ApiVersions negotiation at connection time. If the
>>>>>>>>>>>> negotiated version is unsupported, the consumer knows the broker does
>>>>>>>>>>>> not support PartitionAgeMs at all and can throw an
>>>>>>>>>>>> UnsupportedVersionException immediately, rather than silently falling
>>>>>>>>>>>> back to latest and risking data loss without any operator-visible
>>>>>>>>>>>> signal.
>>>>>>>>>>>> 
>>>>>>>>>>>> MB1/MB2/MB3:
>>>>>>>>>>>> 
>>>>>>>>>>>> I have addressed these changes in the KIP.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Apr 29, 2026 at 4:04 PM Chia-Ping Tsai <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> hi David
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I agree with the direction of moving the 'age' resolution from the
>>>>>>>>>>>>> Heartbeat API to the Metadata API to keep the control plane clean. The
>>>>>>>>>>>>> main trade-off, as we noted before, is introducing inter-broker clock
>>>>>>>>>>>>> skew. The Group Coordinator approach provided a single source of truth
>>>>>>>>>>>>> for time.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> However, realistically, this time skew should be negligible. Given that
>>>>>>>>>>>>> the max.age threshold will likely be configured in minutes or hours, a
>>>>>>>>>>>>> typical NTP skew (in milliseconds) between brokers won't impact the
>>>>>>>>>>>>> fallback decision.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Apr 29, 2026 at 3:29 PM David Jacot via dev <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the KIP!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Sorry, I haven't really followed the previous conversation but I took a
>>>>>>>>>>>>>> quick look at this one.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> DJ01: I don't clearly understand the flow with the
>>>>>>>>>>>>>> ConsumerGroupHeartbeat API after reading the KIP. There is a new
>>>>>>>>>>>>>> boolean; the KIP states that partition ages are returned only when this
>>>>>>>>>>>>>> boolean is set. Implicitly, this means that when the consumer receives a
>>>>>>>>>>>>>> new partition, it will issue a new HB request with the boolean set to
>>>>>>>>>>>>>> receive the ages. Is my understanding correct? We should perhaps clarify
>>>>>>>>>>>>>> the flow and also explain how it fits into the existing flow (e.g. list
>>>>>>>>>>>>>> offsets, fetch offsets, etc.).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> DJ02: If my understanding is correct, I wonder if the
>>>>>>>>>>>>>> ConsumerGroupHeartbeat API is the right place for this given that a new
>>>>>>>>>>>>>> round trip is done anyway. Alternatively, it could simply include the
>>>>>>>>>>>>>> metadata. Generally, we should be rather cautious about not overloading
>>>>>>>>>>>>>> the ConsumerGroupHeartbeat API with unrelated concepts. The API is a
>>>>>>>>>>>>>> control plane API for assigning or revoking partitions. The fact that we
>>>>>>>>>>>>>> don't want to add it to the corresponding Streams API also suggests
>>>>>>>>>>>>>> something is not quite right. What would we do if we want to support
>>>>>>>>>>>>>> Streams in the future?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Jiunn,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thank you for this great KIP. Good to know about the gap.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> mb-0 - Why a new v2 version bump for the RequestPartitionAges field? Can
>>>>>>>>>>>>>>> a tagged field (e.g. on the response, PartitionAges on TopicPartitions)
>>>>>>>>>>>>>>> be used here to avoid the version bump?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> mb-1 - For the new config, is there a recommended value or a ConfigDef
>>>>>>>>>>>>>>> validator? Probably it should be based on metadata.max.age.ms? Sizing
>>>>>>>>>>>>>>> instructions can be part of the Javadocs, I guess.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it be
>>>>>>>>>>>>>>> better to add this new config auto.offset.reset.latest.max.age to the
>>>>>>>>>>>>>>> StreamsConfig block list (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS) for
>>>>>>>>>>>>>>> a clear warning, in case users configure it? This is the most familiar
>>>>>>>>>>>>>>> consumer config and users might easily configure it by mistake. Or
>>>>>>>>>>>>>>> maybe it's not worth adding.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest" reads
>>>>>>>>>>>>>>> as if the config were being changed per-partition, which isn't
>>>>>>>>>>>>>>> supported. Maybe rephrase to something like "consumer resolves the
>>>>>>>>>>>>>>> initial position to the start offset for that partition", as if earliest
>>>>>>>>>>>>>>> were applied to that partition only and the auto.offset.reset config is
>>>>>>>>>>>>>>> unchanged.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Murali
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi chia,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I have updated the KIP to include this change.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Apr 28, 2026 at 8:03 PM Chia-Ping Tsai <[email protected]> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> hi Jiunn-Yang
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> chia_0: Should we expose the partition creation time via the Admin API?
>>>>>>>>>>>>>>>>> I assume it would be valuable for users to diagnose and troubleshoot the
>>>>>>>>>>>>>>>>> behavior of auto.offset.reset.latest.max.age.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Chia-Ping
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
>>>>>>>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I would like to start a discussion on KIP-1327: Prevent Hot Data Loss on
>>>>>>>>>>>>>>>>>> Partition Expansion for Latest Policy
>>>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/x/KY4mGQ>
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> This proposal introduces auto.offset.reset.latest.max.age, a consumer
>>>>>>>>>>>>>>>>>> config that lets the latest reset policy distinguish newly expanded (hot)
>>>>>>>>>>>>>>>>>> partitions from long-existing (cold) ones. Partitions younger than the
>>>>>>>>>>>>>>>>>> configured threshold automatically fall back to earliest, preventing
>>>>>>>>>>>>>>>>>> silent data loss during topic expansion without forcing a full historical
>>>>>>>>>>>>>>>>>> reprocess.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>> Jiunn-Yang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 
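
The reset decision discussed in this thread (a max-age threshold plus a separate policy for new partitions) can be sketched roughly as below. This is only an illustration, not the KIP's implementation: the function name `resolve_reset_policy`, the `ResetPolicy` enum, the strict "younger than" comparison, and the fallback to the base policy when the age is unknown (-1) are all assumptions made for the example.

```python
from enum import Enum

# PartitionAgeMs value the thread describes for "creation time unknown".
UNKNOWN_AGE_MS = -1


class ResetPolicy(Enum):
    EARLIEST = "earliest"
    LATEST = "latest"
    BY_DURATION = "by_duration"
    NONE = "none"


def resolve_reset_policy(partition_age_ms, max_age_ms, base_policy,
                         new_partition_policy=ResetPolicy.EARLIEST):
    """Pick the reset policy for a partition that has no committed offset.

    Hypothetical helper: it mirrors the idea that partitions younger than
    the max-age threshold use the new-partition policy, while all other
    partitions keep the base auto.offset.reset policy.
    """
    # Assumption: an unknown (negative) age falls back to the base policy.
    if partition_age_ms < 0:
        return base_policy
    # Assumption: "new" means strictly younger than the threshold.
    if partition_age_ms < max_age_ms:
        return new_partition_policy
    return base_policy


if __name__ == "__main__":
    # A 5-second-old partition with a 1-minute threshold counts as new.
    print(resolve_reset_policy(5_000, 60_000, ResetPolicy.LATEST).value)
    # A 1-hour-old partition keeps the base policy.
    print(resolve_reset_policy(3_600_000, 60_000, ResetPolicy.LATEST).value)
```

Keeping the new-partition policy as a separate parameter reflects the decoupling proposed in the thread: the base policy stays untouched, and only the choice for young partitions changes.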
