lianetm commented on PR #16241: URL: https://github.com/apache/kafka/pull/16241#issuecomment-2161273163
High-level comment about the approach: getting into caching results of fetch requests seems very tricky, given how easily they can become stale. The simple dedup approach, ensuring we don't send a fetch request if there is already one in-flight for the same partitions, seemed simple and safe (but not enough for poll, agreed). The problem was that fetch requests issued from within the poll loop were being expired with the same poll timeout, which could prevent the consumer from ever fetching offsets on a low poll timeout.

Sorry if I'm missing something, but couldn't we simply ensure that fetch requests issued from poll have no timeout, and apply the timeout only when getting responses? This means we would leave the fetch request running, and on the next call to poll it would either be completed (so we would get the results) or still in-flight (and as long as the assignment does not change, we wouldn't issue a new fetch).

The change I refer to is this snippet from [initWithCommittedOffsetsIfNeeded](https://github.com/apache/kafka/blob/aecaf4447561edd8da9f06e3abdf46f382dc9d89/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1670-L1675). It currently does fetch with timeout + get results. But what about just moving the timeout to the get result?

1. Issue fetch (MAX_VALUE as timeout) -> it will run across poll iterations for the same set of partitions.
2. Get result with the poll timeout -> this ensures that we respect the poll iteration timeout, while leaving the fetch request running to be used on the next poll.

Would this work to solve our problem (while avoiding caching offsets on the client)?
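To make the idea concrete, here is a minimal sketch of the two-step pattern above. It is not Kafka's actual API (the class and method names here are illustrative only): a pending future is issued once with no per-request expiration, reused across poll iterations, and the caller's poll timeout is applied only when waiting for the result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: issue the offsets fetch once (no expiration, i.e.
// MAX_VALUE semantics), keep the pending future across poll iterations,
// and apply the poll timeout only when retrieving the result.
public class OffsetsFetchSketch {

    private CompletableFuture<Long> pendingFetch; // in-flight fetch, reused across polls

    // Simulates the background thread completing the fetch after some delay.
    private CompletableFuture<Long> issueFetch() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(50); // pretend network round trip
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return 42L; // pretend committed offset
        });
    }

    // One poll iteration: returns the offset if the fetch completed within
    // the poll timeout, otherwise null, leaving the fetch running.
    public Long pollOnce(long timeoutMs) {
        if (pendingFetch == null) {
            pendingFetch = issueFetch(); // step 1: issue fetch with no expiration
        }
        try {
            Long result = pendingFetch.get(timeoutMs, TimeUnit.MILLISECONDS); // step 2
            pendingFetch = null; // consumed; a later poll may issue a new fetch
            return result;
        } catch (TimeoutException e) {
            return null; // poll timeout expired, but the fetch keeps running
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        OffsetsFetchSketch consumer = new OffsetsFetchSketch();
        Long first = consumer.pollOnce(1); // tiny poll timeout: expires, fetch stays in flight
        Thread.sleep(100);                 // give the in-flight fetch time to complete
        Long second = consumer.pollOnce(1); // same future, now completed
        System.out.println(first + " " + second);
    }
}
```

The point the sketch tries to show: the first short poll does not cancel or expire the request, so a later poll with the same assignment can pick up the completed result instead of re-issuing the fetch.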
