lianetm commented on PR #16241:
URL: https://github.com/apache/kafka/pull/16241#issuecomment-2161273163

   High level comment about the approach. Getting into caching results of fetch 
requests seems very tricky, given how easily they can become stale. The simple 
dedup approach to ensure we don't send a fetch request if there is another one 
in-flight for the same partitions seemed simple and safe (but not enough for 
poll, agreed). The problem was: fetch requests issued from within the poll loop 
were being expired with the same poll timeout, which could prevent the consumer 
from ever fetching offsets on low poll timeout. 
   
   Sorry if I'm missing something, but couldn't we simply ensure that fetch 
requests issued from the poll have no timeout, and we apply the timeout only 
when getting responses. This means that we would leave the fetch request 
running, and on the next call to poll it would be either completed (so we would 
get the results), or in-flight (as long as the assignment does not change, we 
wouldn't issue a new fetch). The change I refer to is this snippet from 
[initWithCommittedOffsetsIfNeeded](https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1670-L1675).
 It currently does fetch with timeout + get results. 
    But what about just moving the timeout to the get result?
   1. issue fetch (max_value as timeout) -> will run across poll iterations for 
the same set of partitions 
   2. get result with poll timeout -> this ensures that we respect the poll 
iteration timeout, while leaving fetch request running to be used on the next 
poll.
   
   Would this work to solve our problem? (while avoiding caching offsets on the 
client).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to