Dhruvil Shah created KAFKA-10517:
------------------------------------
Summary: Inefficient consumer processing with fetch sessions
Key: KAFKA-10517
URL: https://issues.apache.org/jira/browse/KAFKA-10517
Project: Kafka
Issue Type: Bug
Reporter: Dhruvil Shah
With the introduction of fetch sessions, the consumer and the broker share a
unified view of the partitions being consumed and their current state
(fetch_offset, last_propagated_hwm, last_propagated_start_offset, etc.). The
consumer is still expected to consume in a round-robin manner; however, we have
observed cases where it does not.
Because of how we perform memory management on the consumer and implement fetch
pipelining, we exclude partitions from a FetchRequest when they have not been
drained by the application. This is done by adding these partitions to the
`toForget` list in the `FetchRequest`. When partitions are added to the
`toForget` list, the broker removes these partitions from its session cache.
This causes a divergence between the broker's and the client's view of
the metadata.
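
The exclusion step described above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not the actual consumer or broker code: `BrokerSession` and `build_fetch` are hypothetical names, partitions are plain strings, and `toSend` is modeled as "partitions to add to the session".

```python
from dataclasses import dataclass, field


@dataclass
class BrokerSession:
    """Toy stand-in for the broker-side fetch session cache (hypothetical)."""
    partitions: set = field(default_factory=set)

    def apply(self, to_send, to_forget):
        # Partitions listed in the request are added to the session.
        self.partitions |= set(to_send)
        # Forgotten partitions are removed from the session cache.
        self.partitions -= set(to_forget)


def build_fetch(assigned, undrained, session_partitions):
    """Split the assignment into (to_send, to_forget) lists.

    assigned: partitions the consumer owns, in fetch order
    undrained: partitions whose buffered records the application has not consumed
    session_partitions: partitions currently held in the session
    """
    # Drained partitions that are not yet in the session are added.
    to_send = [p for p in assigned
               if p not in undrained and p not in session_partitions]
    # Undrained partitions that are in the session are excluded via toForget.
    to_forget = [p for p in assigned
                 if p in undrained and p in session_partitions]
    return to_send, to_forget
```

In this model, once a partition lands on `to_forget` the broker no longer tracks it, so the two sides hold different partition sets until the partition is sent again.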
When forgotten partitions are added back to the `FetchRequest` after the application
has drained them, the server immediately adds them back to the session
cache and return a response for them, even if there is no corresponding data.
This causes the consumer to incorrectly put the partition back on the
`toForget` list, even though no data may have been returned for the
partition.
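
The resulting cycle can be illustrated with a small simulation for a single partition that never has new data. This is a deliberately simplified sketch: `simulate` is a hypothetical helper, and the assumptions that the application drains everything between requests and that an empty response still marks the partition as pending are modeling choices made to reproduce the behavior described here, not a trace of the real client.

```python
def simulate(partition, rounds):
    """Return which list the partition lands on in each fetch round."""
    broker_session = set()  # broker-side session cache
    buffered = set()        # consumer side: responses not yet drained
    history = []
    for _ in range(rounds):
        if partition in buffered:
            # Not drained when the request is built: excluded via toForget,
            # so the broker drops it and sends no response for it.
            broker_session.discard(partition)
            history.append("toForget")
            responded = set()
        else:
            # Drained: added back, and the broker responds right away,
            # even though the response carries no records in this model.
            broker_session.add(partition)
            history.append("toSend")
            responded = {partition}
        # The application drains whatever was buffered before this request.
        buffered.clear()
        # Assumed flaw: an empty response still marks the partition as pending.
        buffered |= responded
    return history
```

Calling `simulate("t-0", 4)` yields an alternating `toSend` / `toForget` sequence, which is the shuffling between the two lists that the report describes.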
We have seen this behavior cause an imbalance in lag across partitions, because
the consumer no longer follows the round-robin sequence when partitions
keep shuffling between the `toForget` and `toSend` lists.
At a high level, this is caused by the out-of-sync session caches on the
consumer and broker. The result is a state where the partition balance is
determined by external factors (such as whether metadata was returned for
a partition) rather than by the round-robin ordering.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)