Hi David,

> In your missing semantic section, I don’t fully understand how the 4th
point is improved by the KIP. It says start from earliest but with the
change it would start from application start time. Could you elaborate?

Thanks for the question. With by_start_time, the consumer seeks using its
startup timestamp. So after log truncation, any remaining records with
timestamps at or after that startup time are still reachable through the
timestamp-based lookup, rather than being silently skipped.

By contrast, latest simply jumps to the current end of the partition and
ignores all existing data. The improvement here is that the timestamp-based
approach preserves access to post-startup data that survives truncation,
whereas latest offers no such guarantee. I will update the KIP wording to
make this clearer.
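
To make the difference concrete, here is a minimal toy model in plain
Python. It is not Kafka client code: Record, reset_latest, and
reset_by_start_time are hypothetical names used only for illustration, the
offsets and timestamps are made up, and falling back to the log end when no
record matches is my assumption rather than something the KIP specifies.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    offset: int
    timestamp_ms: int


def reset_latest(log: List[Record], log_end_offset: int) -> int:
    """Toy model of `latest`: always jump to the end of the partition."""
    return log_end_offset


def reset_by_start_time(
    log: List[Record], log_end_offset: int, start_time_ms: int
) -> int:
    """Toy model of `by_start_time`: first surviving record whose timestamp
    is at or after the consumer's startup time."""
    for record in log:
        if record.timestamp_ms >= start_time_ms:
            return record.offset
    # Assumption: with no matching record, behave like `latest`.
    return log_end_offset


# Hypothetical partition state after truncation removed offsets 0..6.
LOG_AFTER_TRUNCATION = [Record(7, 170), Record(8, 180), Record(9, 190)]
LOG_END_OFFSET = 10
START_TIME_MS = 150  # consumer started before the surviving records were written

if __name__ == "__main__":
    print("latest        ->", reset_latest(LOG_AFTER_TRUNCATION, LOG_END_OFFSET))
    print(
        "by_start_time ->",
        reset_by_start_time(LOG_AFTER_TRUNCATION, LOG_END_OFFSET, START_TIME_MS),
    )
```

In this sketch latest resolves to offset 10 and skips the three surviving
records, while by_start_time resolves to offset 7 and still reads them.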

Best Regards,
Jiunn-Yang


> On March 6, 2026, at 5:12 PM, Chia-Ping Tsai <[email protected]> wrote:
> 
> hi David
> 
>> I was also considering this solution while we discussed in the jira. It
>> seems to work in most of the cases but not in all. For instance, let’s
>> imagine a partition created just before a new consumer joins or rejoins the
>> group and this consumer gets the new partition. In this case, the consumer
>> will have a start time which is older than the partition creation time.
>> This could also happen with the truncation case. It makes the behavior kind
>> of unpredictable again.
> 
> Using a server-side timestamp means comparing the Group Coordinator's time 
> against the Partition Leader's time (which is often a different node). 
> Without strict clock synchronization in Kafka, this "happens-before" 
> relationship remains fundamentally unpredictable.
> 
>> Instead of relying on a local timestamp, one idea would be to rely on a
>> timestamp provided by the server. For instance, we could define the time
>> since the group became non-empty. This would define the subscription time
>> for the consumer group. The downside is that it only works if the consumer
>> is part of a group.
> 
> auto.offset.reset is strictly a client-level config. Consumers within the 
> same group can intentionally use different policies. Tying this to a global 
> group state feels like a semantic mismatch. A local timestamp aligns much 
> better, similar to how by_duration operates.
