[ 
https://issues.apache.org/jira/browse/KAFKA-10518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302075#comment-17302075
 ] 

Colin McCabe commented on KAFKA-10518:
--------------------------------------

One simple workaround is to increase max.poll.records so that you get more 
records per fetch for the partition that has a high rate of new records.
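
As a rough illustration of that workaround, the sketch below renders a consumer properties override. max.poll.records is a real consumer setting (default 500; it caps the records returned by a single poll() call). The helper names and the value 2000 are hypothetical, chosen only for illustration; a sensible value depends on record size and per-record processing time.

```python
# Sketch only: build and render a consumer override for the workaround.
# The helper names and the example value are assumptions, not Kafka APIs.

DEFAULT_MAX_POLL_RECORDS = 500  # documented Kafka consumer default


def workaround_overrides(max_poll_records=2000):
    """Return consumer overrides that raise max.poll.records above the default."""
    if max_poll_records <= DEFAULT_MAX_POLL_RECORDS:
        raise ValueError("value must exceed the default to have any effect")
    return {"max.poll.records": str(max_poll_records)}


def to_properties(overrides):
    """Render overrides as key=value lines, as in a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(overrides.items()))


if __name__ == "__main__":
    print(to_properties(workaround_overrides()))
```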

I do wonder whether there's any useful purpose served by "explicitly 
exclud[ing] partitions for which the consumer received data in the previous 
round".  It seems a bit like an implementation hack based on how we do 
buffering in the consumer, but I could be missing something....

> Consumer fetches could be inefficient when lags are unbalanced
> --------------------------------------------------------------
>
>                 Key: KAFKA-10518
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10518
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dhruvil Shah
>            Priority: Major
>
> Consumer fetches are inefficient when lags are imbalanced across partitions, 
> due to head-of-line blocking and the broker holding a fetch request for up to 
> `fetch.max.wait.ms` until data is available.
> When the consumer receives a fetch response, it prepares the next fetch 
> request and sends it out. The caveat is that this subsequent fetch request 
> explicitly excludes partitions for which the consumer received data in 
> the previous round. This lets the consumer application drain the 
> data for those partitions while the consumer fetches the other partitions it 
> is subscribed to.
> This behavior works poorly when lag is unbalanced across partitions: the 
> consumer receives data for the partitions it is lagging on, and then sends 
> a fetch request for partitions that have no data (or little data). That 
> request ends up blocking for fetch.max.wait.ms on the broker before an empty 
> response is sent back. This slows down the consumer’s overall consumption 
> throughput, since fetch.max.wait.ms is 500 ms by default.
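
To put rough numbers on the description above, here is a back-of-the-envelope model of the alternating pattern: one fetch that returns data for the lagging partition right away, then one fetch that excludes it and waits out fetch.max.wait.ms. This is a deliberately simplified sketch, not the consumer's actual logic. It assumes a single broker, a single lagging partition, and no other traffic; the function names, records_per_fetch, and round_trip_ms are hypothetical parameters introduced for illustration.

```python
# Toy model of the fetch pattern described in the issue (assumptions only).

def lagging_partition_throughput(records_per_fetch,
                                 fetch_max_wait_ms=500.0,
                                 round_trip_ms=5.0):
    """Records/sec for the lagging partition under the alternating pattern."""
    # Round 1: the fetch includes the lagging partition and returns at once.
    productive_ms = round_trip_ms
    # Round 2: the fetch excludes it; the broker holds the request for
    # fetch.max.wait.ms and then sends back an empty response.
    blocked_ms = fetch_max_wait_ms + round_trip_ms
    return records_per_fetch * 1000.0 / (productive_ms + blocked_ms)


def unblocked_throughput(records_per_fetch, round_trip_ms=5.0):
    """Records/sec if every fetch returned data for the lagging partition."""
    return records_per_fetch * 1000.0 / round_trip_ms


if __name__ == "__main__":
    blocked = lagging_partition_throughput(records_per_fetch=500)
    ideal = unblocked_throughput(records_per_fetch=500)
    print(f"with blocking: {blocked:.0f} records/s, without: {ideal:.0f} records/s")
```

In this model the blocked round dominates the cycle time, so throughput for the lagging partition grows almost linearly with the number of records obtained per productive fetch, and falls as fetch.max.wait.ms rises.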



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
