Hi Jun, I see what you mean now. My proposal is listed below:
1) Add auto.offset.reset.new.partitions with a default value of earliest. It
   fixes the data loss for both by_duration and latest, and it does not change
   the logic of auto.offset.reset=earliest.
2) Mark auto.offset.reset.new.partitions as an internal configuration.
   auto.offset.reset.new.partitions=earliest already addresses the issue, and
   we can discuss the use cases for other values in a separate KIP.
3) Both configs, auto.offset.reset.new.partitions and
   auto.offset.reset.latest.max.age.ms, will apply to all reset policies, for
   consistency.

WDYT?

On 2026/05/15 20:53:20 Jun Rao via dev wrote:
> Hi, Chia-Ping,
>
> Thanks for the reply.
>
> 1. In the motivation section, the KIP says "When a Kafka topic is expanded
> with new partitions, consumers using the latest auto offset reset policy
> will silently miss all records produced to those partitions before the
> consumer discovers them.". If a user sets
> auto.offset.reset=by_duration=1sec, the same record loss issue could also
> happen, right?
>
> 2. I was thinking auto.offset.reset.new.partitions will take the same
> values as auto.offset.reset. So a user could set it by_duration if needed.
>
> Jun
>
> On Thu, May 14, 2026 at 4:06 PM Chia-Ping Tsai <[email protected]> wrote:
>
> > hi Jun
> >
> > Thanks for the feedback. I might be missing something important from your
> > suggestion, so please bear with me as I try to clarify with a few questions:
> >
> > 1. Is there a strong use case for extending this logic to other reset
> > policies? Unlike latest, policies like earliest or by_duration don't seem
> > to suffer from the same silent data loss issue when a partition is expanded.
> >
> > 2. What values would we expect users to configure for
> > auto.offset.reset.new.partitions? If they set it to earliest or latest,
> > we might run into the exact same edge cases.
> > For example, if a consumer is
> > offline for a while and a new partition is created during that downtime,
> > the user might actually want to skip to latest when resuming, rather than
> > reading from earliest just because the partition is technically "new" to
> > the group.
> >
> > This is exactly why we opted for introducing a max.age threshold. It gives
> > users a time-bound way to define what is genuinely "hot/new" and what is
> > just an old partition they haven't seen yet.
> >
> > Best,
> > Chia-Ping
> >
> > On 2026/05/14 20:48:09 Jun Rao via dev wrote:
> > > Hi, Jiunn-Yang,
> > >
> > > Thanks for the KIP.
> > >
> > > I find auto.offset.reset.latest.max.age a bit weird. It only applies when
> > > auto.offset.reset is latest. However, it seems that the motivation equally
> > > applies when auto.offset.reset is set to other values like by_duration. The
> > > intention is that we want to have a separate way to control newly created
> > > partitions vs existing partitions when the group starts. Have we considered
> > > adding a new config like auto.offset.reset.new.partitions? If this new
> > > config is not set, the offset reset policy defaults to the policy used for
> > > existing partitions. The user could set it explicitly to customize the
> > > behavior for new partitions.
> > >
> > > Jun
> > >
> > > On Thu, May 7, 2026 at 5:07 AM 黃竣陽 <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’d like to manually bump this thread.
> > > >
> > > > Best Regards,
> > > > Jiunn-Yang
> > > >
> > > > > 黃竣陽 <[email protected]> wrote on May 1, 2026 at 10:37 PM:
> > > > >
> > > > > Hello all,
> > > > >
> > > > > Thanks for the feedback.
> > > > >
> > > > > DJ01/DJ02:
> > > > >
> > > > > MetadataResponse bumps from v13 to v14. The PartitionMetadata struct
> > > > > gains a new field PartitionAgeMs (int64, default -1), computed
> > > > > server-side by the broker as
> > > > > broker_current_time - partition_creation_time.
> > > > >
> > > > > Also add the consumer heartbeat flow. When MembershipManager detects a
> > > > > newly assigned partition, it explicitly invalidates the metadata for the
> > > > > affected topic and forces a fresh MetadataRequest before making the
> > > > > offset reset decision, even if the topic ID is already in the cache.
> > > > >
> > > > > MB0:
> > > > >
> > > > > The consumer learns the broker's maximum supported MetadataResponse
> > > > > version via the ApiVersions negotiation at connection time. If the
> > > > > negotiated version is unsupported, the consumer knows the broker does
> > > > > not support PartitionAgeMs at all and can throw an
> > > > > UnsupportedVersionException immediately, rather than silently falling
> > > > > back to latest and risking data loss without any operator-visible signal.
> > > > >
> > > > > MB1/MB2/MB3:
> > > > >
> > > > > I have addressed these changes in the KIP.
> > > > >
> > > > > Best Regards,
> > > > > Jiunn-Yang
> > > > >
> > > > >> Chia-Ping Tsai <[email protected]> wrote on April 29, 2026 at 4:04 PM:
> > > > >>
> > > > >> hi David
> > > > >>
> > > > >> I agree with the direction of moving the 'age' resolution from the
> > > > >> Heartbeat API to the Metadata API to keep the control plane clean. The
> > > > >> main trade-off, as we noted before, is introducing inter-broker clock
> > > > >> skew. The Group Coordinator approach provided a single source of truth
> > > > >> for time.
> > > > >>
> > > > >> However, realistically, this time skew should be negligible. Given that
> > > > >> the max.age threshold will likely be configured in minutes or hours, a
> > > > >> typical NTP skew (in milliseconds) between brokers won't impact the
> > > > >> fallback decision.
> > > > >>
> > > > >> Best,
> > > > >> Chia-Ping
> > > > >>
> > > > >>> David Jacot via dev <[email protected]> wrote on April 29, 2026 at 3:29 PM:
> > > > >>>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> Thanks for the KIP!
> > > > >>>
> > > > >>> Sorry, I haven't really followed the previous conversation but I took a
> > > > >>> quick look at this one.
> > > > >>>
> > > > >>> DJ01: I don't clearly understand the flow with the ConsumerGroupHeartbeat
> > > > >>> API after reading the KIP. There is a new boolean; the KIP states that
> > > > >>> partition ages are returned only when this boolean is set. Implicitly, this
> > > > >>> means that when the consumer receives a new partition, it will issue a new
> > > > >>> HB request with the boolean set to receive the ages. Is my understanding
> > > > >>> correct? We should perhaps clarify the flow and also explain how it fits
> > > > >>> into the existing flow (e.g. list offsets, fetch offsets, etc.).
> > > > >>> DJ02: If my understanding is correct, I wonder if
> > > > >>> the ConsumerGroupHeartbeat API is the right place for this given that a new
> > > > >>> round trip is done anyway. Alternatively, it could simply include the
> > > > >>> metadata. Generally, we should be rather cautious about overloading
> > > > >>> the ConsumerGroupHeartbeat API with unrelated concepts. The API is a
> > > > >>> control plane API for assigning or revoking partitions. The fact that we
> > > > >>> don't want to add it to the corresponding Streams API also suggests
> > > > >>> something is not quite right. What would we do if we want to support
> > > > >>> Streams in the future?
> > > > >>>
> > > > >>> Best,
> > > > >>> David
> > > > >>>
> > > > >>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev <
> > > > >>>> [email protected]> wrote:
> > > > >>>>
> > > > >>>> Hi Jiunn,
> > > > >>>>
> > > > >>>> Thank you for this great kip. Good to know about the gap.
> > > > >>>>
> > > > >>>> mb-0 - why a new v2 version bump for RequestPartitionAges field.
> > > > >>>> Can a
> > > > >>>> tagged field (for ex: on response, PartitionAges on TopicPartitions) be
> > > > >>>> used here and avoid version bump?
> > > > >>>>
> > > > >>>> mb-1 - For the new config, is there a recommended value or a ConfigDef
> > > > >>>> validator? Probably it should be based on the metadata.max.age.ms ? Sizing
> > > > >>>> instructions can be part of javadocs I guess.
> > > > >>>>
> > > > >>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it be better
> > > > >>>> to add this new config auto.offset.reset.latest.max.age to the
> > > > >>>> StreamsConfig block list (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS) for a
> > > > >>>> clear warning, in case users configure it? This is the most familiar
> > > > >>>> consumer config and users might easily mistakenly configure it. Or maybe
> > > > >>>> it's not worth it to add.
> > > > >>>>
> > > > >>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest" reads as
> > > > >>>> if the config were being changed per-partition which isn't supported. Maybe
> > > > >>>> rephrasing to something like "consumer resolves the initial position to
> > > > >>>> start offset for that partition" as if earliest was applied to that
> > > > >>>> partition only and auto.offset.reset config is unchanged.
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> Murali
> > > > >>>>
> > > > >>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
> > > > >>>>>
> > > > >>>>> Hi chia,
> > > > >>>>>
> > > > >>>>> I have updated the KIP to include this change.
> > > > >>>>>
> > > > >>>>> Best Regards,
> > > > >>>>> Jiunn-Yang
> > > > >>>>>
> > > > >>>>>> Chia-Ping Tsai <[email protected]> wrote on April 28, 2026 at 8:03 PM:
> > > > >>>>>>
> > > > >>>>>> hi Jiunn-Yang
> > > > >>>>>>
> > > > >>>>>> chia_0: Should we expose the partition creation time via the Admin API?
> > > > >>>>>> I assume it would be valuable for users to diagnose and troubleshoot the
> > > > >>>>>> behavior of auto.offset.reset.latest.max.age
> > > > >>>>>>
> > > > >>>>>> Best,
> > > > >>>>>> Chia-Ping
> > > > >>>>>>
> > > > >>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
> > > > >>>>>>> Hello everyone,
> > > > >>>>>>>
> > > > >>>>>>> I would like to start a discussion on KIP-1327 Prevent Hot Data Loss on
> > > > >>>>>>> Partition Expansion for Latest Policy
> > > > >>>>>>> <https://urldefense.com/v3/__https://cwiki.apache.org/confluence/x/KY4mGQ__;!!Ayb5sqE7!qF4q1QzF1RRgP61D7A2xuEai1ky7fepKDKFFvpNBuePikH-ULmT87TvuuZzy5kau5E4y5zMZAmfQQiwZomM$>
> > > > >>>>>>>
> > > > >>>>>>> This proposal aims to introduce auto.offset.reset.latest.max.age, a
> > > > >>>>>>> consumer config that lets the latest reset policy distinguish newly
> > > > >>>>>>> expanded (hot) partitions from long-existing (cold) ones. Partitions
> > > > >>>>>>> younger than the configured threshold automatically fall back to
> > > > >>>>>>> earliest, preventing silent data loss during topic expansion without
> > > > >>>>>>> forcing a full historical reprocess.
> > > > >>>>>>>
> > > > >>>>>>> Best regards,
> > > > >>>>>>> Jiunn-Yang
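
For reference, the three-point proposal at the top of this thread can be sketched in a few lines. This is only an illustrative model, not the KIP's implementation. The names ResetConfig, partition_age_ms, and resolve_reset_policy are hypothetical. Treating an unset threshold as "feature off" is an assumption, and raising on an unknown age (-1) is one reading of the MB0 reply rather than settled behavior. The sketch assumes a partition counts as "new" when its age is below auto.offset.reset.latest.max.age.ms, that new partitions use auto.offset.reset.new.partitions, and that all other partitions keep auto.offset.reset.

```python
from dataclasses import dataclass
from typing import Optional

UNKNOWN_AGE_MS = -1  # the thread gives -1 as the default for PartitionAgeMs


@dataclass(frozen=True)
class ResetConfig:
    """Hypothetical holder for the three consumer settings discussed above."""
    auto_offset_reset: str = "latest"        # auto.offset.reset
    new_partitions_reset: str = "earliest"   # auto.offset.reset.new.partitions
    max_age_ms: Optional[int] = None         # auto.offset.reset.latest.max.age.ms


def partition_age_ms(broker_now_ms: int, created_ms: int) -> int:
    """Age as described in the thread: broker_current_time - partition_creation_time."""
    return broker_now_ms - created_ms


def resolve_reset_policy(config: ResetConfig, age_ms: int) -> str:
    """Pick the reset policy for a partition that has no committed offset."""
    if config.max_age_ms is None:
        # Assumption: no threshold configured means the feature is off.
        return config.auto_offset_reset
    if age_ms == UNKNOWN_AGE_MS:
        # Assumption: fail loudly instead of silently using the normal policy.
        raise ValueError("partition age unavailable; cannot apply max-age threshold")
    if age_ms < config.max_age_ms:
        # Younger than the threshold: treat as a new (hot) partition.
        return config.new_partitions_reset
    return config.auto_offset_reset


if __name__ == "__main__":
    cfg = ResetConfig(max_age_ms=60 * 60 * 1000)  # one-hour threshold
    young = partition_age_ms(broker_now_ms=1_000_000, created_ms=700_000)
    print(resolve_reset_policy(cfg, young))
    print(resolve_reset_policy(cfg, 24 * 60 * 60 * 1000))
```

With a one-hour threshold, a five-minute-old partition resolves to earliest and a day-old partition keeps latest. The same function works unchanged if auto_offset_reset is by_duration, which is the case Jun raised.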
