hi Jun

I see what you mean now. The proposal from me is listed below:

1) Add auto.offset.reset.new.partitions with a default value of earliest. It 
fixes the data loss from both by_duration and latest, and it does not change 
the logic of auto.offset.reset=earliest.
2) Mark auto.offset.reset.new.partitions as an internal configuration. 
auto.offset.reset.new.partitions=earliest already addresses the issue, and we 
can discuss the use cases of other values in a separate KIP.
3) Both configs, auto.offset.reset.new.partitions and 
auto.offset.reset.latest.max.age.ms, will be applied to all for consistency.

WDYT?

On 2026/05/15 20:53:20 Jun Rao via dev wrote:
> Hi, Chia-Ping,
> 
> Thanks for the reply.
> 
> 1. In the motivation section, the KIP says "When a Kafka topic is expanded
> with new partitions, consumers using the latest auto offset reset policy
> will silently miss all records produced to those partitions before the
> consumer discovers them.". If a user sets
> auto.offset.reset=by_duration=1sec, the same record loss issue could also
> happen, right?
> 
> 2. I was thinking auto.offset.reset.new.partitions will take the same
> values as auto.offset.reset. So a user could set it by_duration if needed.
> 
> Jun
> 
> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai <[email protected]> wrote:
> 
> > hi Jun
> >
> > Thanks for the feedback. I might be missing something important from your
> > suggestion, so please bear with me as I try to clarify with a few questions:
> >
> > 1. Is there a strong use case for extending this logic to other reset
> > policies? Unlike latest, policies like earliest or by_duration don't seem
> > to suffer from the same silent data loss issue when a partition is expanded.
> >
> > 2. What values would we expect users to configure for
> > auto.offset.reset.new.partitions? If they set it to earliest or latest,
> > we might run into the exact same edge cases. For example, if a consumer is
> > offline for a while and a new partition is created during that downtime,
> > the user might actually want to skip to latest when resuming, rather than
> > reading from earliest just because the partition is technically "new" to
> > the group.
> >
> > This is exactly why we opted for introducing a max.age threshold. It gives
> > users a time-bound way to define what is genuinely "hot/new" and what is
> > just an old partition they haven't seen yet.
> >
> > Best,
> > Chia-Ping
> >
> > On 2026/05/14 20:48:09 Jun Rao via dev wrote:
> > > Hi, Jiunn-Yang,
> > >
> > > Thanks for the KIP.
> > >
> > > I find auto.offset.reset.latest.max.age a bit weird. It only applies when
> > > auto.offset.reset is latest. However, it seems that the motivation
> > equally
> > > applies when auto.offset.reset is set to other values like by_duration.
> > The
> > > intention is that we want to have a separate way to control newly created
> > > partitions vs existing partitions when the group starts. Have we
> > considered
> > > adding a new config like auto.offset.reset.new.partitions? If this new
> > > config is not set, the offset reset policy defaults to the policy used
> > for
> > > existing partitions. The user could set it explicitly to customize the
> > > behavior for new partitions.
> > >
> > > Jun
> > >
> > > On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’d like to manually bump this thread.
> > > >
> > > > Best Regards,
> > > > Jiunn-Yang
> > > >
> > > > > 黃竣陽 <[email protected]> 於 2026年5月1日 晚上10:37 寫道:
> > > > >
> > > > > Hello all,
> > > > >
> > > > > Thanks for the feedback.
> > > > >
> > > > > DJ01/DJ02:
> > > > >
> > > > > MetadataResponse bumps from v13 to v14. The PartitionMetadata struct
> > > > gains a new
> > > > > field PartitionAgeMs (int64, default -1), computed server-side by the
> > > > broker as
> > > > > broker_current_time - partition_creation_time.
> > > > >
> > > > > Also add the consumer heartbeat flow. when MembershipManager detects
> > a
> > > > newly assigned
> > > > > partition, it explicitly invalidates the metadata for the affected
> > topic
> > > > and forces a fresh MetadataRequest
> > > > > before making the offset reset decision, even if the topic ID is
> > already
> > > > in the cache.
> > > > >
> > > > > MB0:
> > > > >
> > > > > The consumer learns the broker's maximum supported MetadataResponse
> > > > version via the
> > > > > ApiVersions negotiation at connection time. If the negotiated
> > version is
> > > > unsupported, the consumer
> > > > > knows the broker does not support PartitionAgeMs at all and can
> > throw an
> > > > UnsupportedVersionException
> > > > > immediately, rather than silently falling back to latest and risking
> > > > data loss without any operator-visible signal.
> > > > >
> > > > > MB1/MB2/MB3:
> > > > >
> > > > > I have addressed these changes in the KIP.
> > > > >
> > > > > Best Regards,
> > > > > Jiunn-Yang
> > > > >
> > > > >> Chia-Ping Tsai <[email protected]> 於 2026年4月29日 下午4:04 寫道:
> > > > >>
> > > > >> hi David
> > > > >>
> > > > >> I agree with the direction of moving the 'age' resolution from the
> > > > Heartbeat API to the Metadata API to keep the control plane clean. The
> > main
> > > > trade-off, as we noted before, is introducing inter-broker clock skew.
> > The
> > > > Group Coordinator approach provided a single source of truth for time.
> > > > >>
> > > > >> However, realistically, this time skew should be negligible. Given
> > that
> > > > the max.age threshold will likely be configured in minutes or hours, a
> > > > typical NTP skew (in milliseconds) between brokers won't impact the
> > > > fallback decision.
> > > > >>
> > > > >> Best,
> > > > >> Chia-Ping
> > > > >>
> > > > >>> David Jacot via dev <[email protected]> 於 2026年4月29日 下午3:29 寫道:
> > > > >>>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> Thanks for the KIP!
> > > > >>>
> > > > >>> Sorry, I haven't really followed the previous conversation but I
> > took a
> > > > >>> quick look at this one.
> > > > >>>
> > > > >>> DJ01: I don't clearly understand the flow with the
> > > > ConsumerGroupHeartbeat
> > > > >>> API after reading the KIP. There is a new boolean; the KIP states
> > that
> > > > >>> partition ages are returned only when this boolean is set.
> > Implicitly,
> > > > this
> > > > >>> means that when the consumer receives a new partition, it will
> > issue a
> > > > new
> > > > >>> HB request with the boolean set to receive the ages. Is my
> > > > understanding
> > > > >>> correct? We should perhaps clarify the flow and also explain how it
> > > > fits
> > > > >>> into the existing flow (e.g. list offsets, fetch offsets, etc.).
> > > > >>> DJ02: It my understanding is correct, I wonder if
> > > > >>> the ConsumerGroupHeartbeat API is the right place for this given
> > that
> > > > a new
> > > > >>> round trip is done anyway. Alternatively, it could simply include
> > the
> > > > >>> metadata. Generally, we should be rather cautious about not
> > overloading
> > > > >>> the ConsumerGroupHeartbeat API with unrelated concepts. The API is
> > a
> > > > >>> control plane API for assigning or revoking partitions. The fact
> > that
> > > > we
> > > > >>> don't want to add it to the corresponding Streams API also suggests
> > > > >>> something is not quite right. What would we do if we want to
> > support
> > > > >>> Streams in the future?
> > > > >>>
> > > > >>> Best,
> > > > >>> David
> > > > >>>
> > > > >>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev <
> > > > >>>> [email protected]> wrote:
> > > > >>>>
> > > > >>>> Hi Jiunn,
> > > > >>>>
> > > > >>>> Thank you for this great kip. Good to know about the gap.
> > > > >>>>
> > > > >>>> mb-0 - why a new v2 version bump for RequestPartitionAges field.
> > Can a
> > > > >>>> tagged field (for ex: on response, PartitionAges on
> > TopicPartitions)
> > > > be
> > > > >>>> used here and avoid version bump?
> > > > >>>>
> > > > >>>> mb-1 - For the new config, is there a recommended value or a
> > ConfigDef
> > > > >>>> validator? Probably it should based on the metadata.max.age.ms ?
> > > > Sizing
> > > > >>>> instructions can be part of javadocs I guess.
> > > > >>>>
> > > > >>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it
> > be
> > > > better
> > > > >>>> to add this new config auto.offset.reset.latest.max.age to the
> > > > >>>> StreamsConfig block list
> > (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS)
> > > > for a
> > > > >>>> clear warning, incase users configure it? This is the most
> > familiar
> > > > >>>> consumer config and users might easily mistakenly configure it. Or
> > > > may be
> > > > >>>> it's not worth it to add.
> > > > >>>>
> > > > >>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest"
> > > > reads as
> > > > >>>> if the config were being changed per-partition which isn't
> > supported.
> > > > May
> > > > >>>> be rephrasing to something like "consumer resolves the initial
> > > > position to
> > > > >>>> start offset for that partition" as if earliest was applied to
> > that
> > > > >>>> partition only and auto.offset.reset config is unchanged.
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> Murali
> > > > >>>>
> > > > >>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
> > > > >>>>>
> > > > >>>>> Hi chia,
> > > > >>>>>
> > > > >>>>> I have updated the KIP to include this change.
> > > > >>>>>
> > > > >>>>> Best Regards,
> > > > >>>>> Jiunn-Yang
> > > > >>>>>
> > > > >>>>>> Chia-Ping Tsai <[email protected]> 於 2026年4月28日 晚上8:03 寫道:
> > > > >>>>>>
> > > > >>>>>> hi Jiunn-Yang
> > > > >>>>>>
> > > > >>>>>> chia_0: Should we expose the partition creation time via the
> > Admin
> > > > API?
> > > > >>>>> I assume it would be valuable for users to diagnose and
> > troubleshoot
> > > > the
> > > > >>>>> behavior of auto.offset.reset.latest.max.age
> > > > >>>>>>
> > > > >>>>>> Best,
> > > > >>>>>> Chia-Ping
> > > > >>>>>>
> > > > >>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
> > > > >>>>>>> Hello everyone,
> > > > >>>>>>>
> > > > >>>>>>> I would like to start a discussion on KIP-1327 Prevent Hot Data
> > > > Loss
> > > > >>>> on
> > > > >>>>> Partition Expansion for Latest Policy
> > > > >>>>>>> <
> > > > >>>>
> > > >
> > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/x/KY4mGQ__;!!Ayb5sqE7!qF4q1QzF1RRgP61D7A2xuEai1ky7fepKDKFFvpNBuePikH-ULmT87TvuuZzy5kau5E4y5zMZAmfQQiwZomM$
> > > > >>>>>
> > > > >>>>>>>
> > > > >>>>>>> This proposal aims to introduces
> > auto.offset.reset.latest.max.age,
> > > > a
> > > > >>>>> consumer config that lets the
> > > > >>>>>>> latest reset policy distinguish newly expanded (hot) partitions
> > > > from
> > > > >>>>> long-existing (cold) ones. Partitions
> > > > >>>>>>> younger than the configured threshold automatically fall back
> > to
> > > > >>>>> earliest, preventing silent data loss
> > > > >>>>>>> during topic expansion without forcing a full historical
> > reprocess.
> > > > >>>>>>>
> > > > >>>>>>> Best regards,
> > > > >>>>>>> Jiunn-Yang
> > > > >>>>>
> > > > >>>>>
> > > > >>>>
> > > > >
> > > >
> > > >
> > >
> >
> 

Reply via email to