[
https://issues.apache.org/jira/browse/KAFKA-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708070#comment-15708070
]
Ismael Juma commented on KAFKA-4007:
------------------------------------
I think it's worth clarifying that a new fetch request is sent once the
previous data for a partition is exhausted, so if the consumer is consuming
from multiple partitions, we won't necessarily be waiting after every request
with max.poll.records. KAFKA-4405 is also relevant: the cost of prefetching
after every `poll` seems to cause lower performance in the streams benchmark
(as is KAFKA-4469, which reduces some of the overhead when max.poll.records
is smaller than the fetch size).
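As a concrete illustration (not part of the original comment), here is a minimal
sketch of the setup under discussion: a consumer with a low max.poll.records
subscribed to a multi-partition topic. The topic name, group id, bootstrap
servers and the newer poll(Duration) overload are assumptions made for the
sketch, not taken from this ticket.
{code:java}
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LowMaxPollRecordsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "low-max-poll-demo");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // A value small relative to the fetch size: each poll() returns only a slice
        // of the data already buffered for a partition, and a new fetch for that
        // partition is only sent once its buffered records are exhausted.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 10);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("multi-partition-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // processing latency here may be small compared to the fetch latency
                    System.out.printf("%s-%d@%d: %s%n", record.topic(), record.partition(),
                            record.offset(), record.value());
                }
            }
        }
    }
}
{code}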
> Improve fetch pipelining for low values of max.poll.records
> -----------------------------------------------------------
>
> Key: KAFKA-4007
> URL: https://issues.apache.org/jira/browse/KAFKA-4007
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Jason Gustafson
> Assignee: Mickael Maison
>
> Currently the consumer will only send a prefetch for a partition after all
> the records from the previous fetch have been consumed. This can lead to
> suboptimal pipelining when max.poll.records is set very low since the
> processing latency for a small set of records may be small compared to the
> latency of a fetch. An improvement suggested by [~junrao] is to send the
> fetch anyway even if we have unprocessed data buffered, but delay reading it
> from the socket until that data has been consumed. Potentially the consumer
> can delay reading _any_ pending fetch until it is ready to be returned to the
> user, which may help control memory better.
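As a rough, hypothetical sketch of the delayed-read idea described above (this is
not the consumer's actual Fetcher code), the model below accepts a new in-flight
fetch for a partition even while earlier records are still buffered, and only
drains the pending response once those buffered records have been consumed. The
names (PartitionPipeline, maybeSendFetch) and the use of CompletableFuture to
stand in for a sent-but-unread response are invented for illustration.
{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PartitionPipeline<R> {
    private final Deque<R> buffered = new ArrayDeque<>();
    private CompletableFuture<List<R>> inFlight; // fetch sent, response not yet read

    /** Issue the next fetch as soon as none is in flight, even while records remain buffered. */
    public void maybeSendFetch(CompletableFuture<List<R>> fetchResponse) {
        if (inFlight == null) {
            inFlight = fetchResponse; // request already on the wire; we only hold the handle
        }
    }

    /** Return up to maxPollRecords, draining the in-flight response only once the buffer is empty. */
    public List<R> poll(int maxPollRecords) {
        if (buffered.isEmpty() && inFlight != null) {
            buffered.addAll(inFlight.join()); // only now is the pending fetch actually read
            inFlight = null;
        }
        List<R> out = new ArrayList<>();
        while (out.size() < maxPollRecords && !buffered.isEmpty()) {
            out.add(buffered.poll());
        }
        return out;
    }
}
{code}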