Matthias J. Sax created KAFKA-19236:
---------------------------------------

             Summary: Auto.offset.reset "by-duration" should reset a single time
                 Key: KAFKA-19236
                 URL: https://issues.apache.org/jira/browse/KAFKA-19236
             Project: Kafka
          Issue Type: Improvement
          Components: clients, consumer, streams
    Affects Versions: 4.0.0
            Reporter: Matthias J. Sax


KIP-1106 introduced a new option "by-duration" for config `auto.offset.reset`
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-1106%3A+Add+duration+based+offset+reset+option+for+consumer+clients).

If a consumer tries to reset to a "future" time, the observed behavior is 
somewhat odd, and we should change it:

Assume there is a topic to which no new data was written for the last hour. A 
new consumer starts up at 1pm and tries to reset by 10 minutes. There is no 
data for 12:50pm, so the consumer won't complete the reset, but will keep 
retrying (every 30 seconds by default) until it can resolve offsets. The issue 
is that the "seek ts" (i.e., 12:50pm) is recomputed on every retry and thus 
moves while the consumer waits.

Because the consumer could not resolve an offset and still has `offsets=null`, 
it keeps re-executing the reset logic.

Hence, if there is no data for another 30 minutes, the consumer would now 
try to find an offset for 1:20pm. This is rather unexpected: if one starts the 
consumer at 1:00pm and resets by 10 minutes, it's reasonable to assume that 
data would start flowing when the topic reaches 12:50pm, even if the consumer 
was idling for 30 minutes.
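
The drift described above can be illustrated with a small sketch. This is not 
the consumer's actual code: the class name, the `seekTimestamp` helper, and the 
calendar date are assumptions made up for illustration. It only shows how 
"now minus duration" yields a different "seek ts" when it is recomputed at a 
later retry.

```java
import java.time.Duration;
import java.time.Instant;

public class SeekTimestampDrift {
    // Hypothetical helper: the "seek ts" is "now minus the configured duration".
    // Recomputing it on every retry makes the result depend on the retry time.
    static Instant seekTimestamp(Instant now, Duration resetBy) {
        return now.minus(resetBy);
    }

    public static void main(String[] args) {
        Duration resetBy = Duration.ofMinutes(10);
        // Arbitrary example date; only the time of day matters here.
        Instant startup = Instant.parse("2025-01-01T13:00:00Z"); // consumer starts at 1:00pm
        Instant retry = startup.plus(Duration.ofMinutes(30));    // still retrying at 1:30pm

        System.out.println(seekTimestamp(startup, resetBy)); // 12:50pm on the first attempt
        System.out.println(seekTimestamp(retry, resetBy));   // 1:20pm after 30 minutes of retries
    }
}
```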

Thus, instead of executing the reset logic every 30 seconds, the reset logic 
should be executed once, to compute 12:50pm as "seek ts", and if the request 
returns `null` (i.e., no offset found), the same request should be resent every 
30 seconds, without re-triggering the reset logic itself, to keep the "seek ts" 
at 12:50pm.
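
One possible shape of the proposed behavior is sketched below. It is only an 
illustration under stated assumptions, not the consumer's or Kafka Streams' 
implementation: the class `PinnedDurationReset`, its methods, and the 
`offsetLookup` callback (standing in for a timestamp-to-offset request) are 
all hypothetical names. The point is that the "seek ts" is computed on the 
first attempt only and reused unchanged on every retry.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.LongFunction;

public class PinnedDurationReset {
    private final Duration resetBy;
    private Long seekTimestampMs; // null until the first attempt pins it

    public PinnedDurationReset(Duration resetBy) {
        this.resetBy = resetBy;
    }

    /**
     * Runs one reset attempt. The "seek ts" is computed only on the first
     * call; later calls ignore nowMs and resend the same timestamp.
     * offsetLookup is a hypothetical stand-in for the offset request and
     * returns an empty Optional when no offset is found.
     */
    public Optional<Long> attempt(long nowMs, LongFunction<Optional<Long>> offsetLookup) {
        if (seekTimestampMs == null) {
            seekTimestampMs = nowMs - resetBy.toMillis();
        }
        return offsetLookup.apply(seekTimestampMs);
    }

    /** The pinned "seek ts" in epoch milliseconds, or null before the first attempt. */
    public Long seekTimestampMs() {
        return seekTimestampMs;
    }
}
```

A caller would invoke `attempt` on each retry interval until it returns a 
non-empty result, and then seek to the returned offset.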

Kafka Streams, which re-implements the by-duration reset logic itself, has 
the same behavior as the consumer and should be updated as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
