corkitse created KAFKA-19415:
--------------------------------
Summary: Improve fairness of partition fetch order when clients
fetch data to avoid partition starvation
Key: KAFKA-19415
URL: https://issues.apache.org/jira/browse/KAFKA-19415
Project: Kafka
Issue Type: Improvement
Reporter: corkitse
Currently, ReplicaManager.readFromLog iterates over the partitions of each
request in a fixed order. When the first few partitions have a large backlog,
they can consume the entire request-level byte budget (maxBytes), so partitions
at the end of the list are starved and return no data in that fetch cycle. Under
heavy load, and when the partition order is stable across requests, this can
lead to persistently high latency for some partitions.
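This behavior ties the latency of one partition to the backlog of the partitions
ahead of it in the same request, which breaks the relative independence of
partitions and may lead to suboptimal resource utilization.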
*Reference code*
{code:scala}
readPartitionInfo.foreach { case (tp, fetchInfo) =>
  val readResult = read(tp, fetchInfo, limitBytes, minOneMessage)
  val recordBatchSize = readResult.info.records.sizeInBytes
  // Once we read from a non-empty partition, we stop ignoring request and
  // partition level size limits
  if (recordBatchSize > 0)
    minOneMessage = false
  limitBytes = math.max(0, limitBytes - recordBatchSize)
  result += (tp -> readResult)
}
{code}
*Proposed Change*
Shuffle the order of readPartitionInfo before iteration to avoid always reading
partitions in a fixed order.
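To illustrate the idea, here is a minimal, self-contained sketch. It is a simplified model, not the actual ReplicaManager code: Partition, serve, serveShuffled and the byte values are hypothetical names and numbers chosen for illustration, and the minOneMessage handling is left out. It only shows how a shared byte budget interacts with iteration order, and how scala.util.Random.shuffle could be applied before the loop.

```scala
import scala.util.Random

// Hypothetical, simplified stand-in for a partition with pending data.
final case class Partition(name: String, backlogBytes: Int)

object FetchOrderSketch {
  // Walk the partitions in the given order and hand out bytes until the
  // request-level budget (maxBytes) is exhausted. Each partition is also
  // capped by partitionMaxBytes. Returns the bytes served per partition.
  def serve(partitions: Seq[Partition],
            maxBytes: Int,
            partitionMaxBytes: Int): Seq[(String, Int)] = {
    var limitBytes = maxBytes
    partitions.map { p =>
      val served = math.min(math.min(p.backlogBytes, partitionMaxBytes), limitBytes)
      limitBytes = math.max(0, limitBytes - served)
      p.name -> served
    }
  }

  // Sketch of the proposed change: randomize the order on every request
  // before running the same budget loop.
  def serveShuffled(partitions: Seq[Partition],
                    maxBytes: Int,
                    partitionMaxBytes: Int,
                    rng: Random): Seq[(String, Int)] =
    serve(rng.shuffle(partitions), maxBytes, partitionMaxBytes)

  def main(args: Array[String]): Unit = {
    val parts = Seq(Partition("t-0", 1000), Partition("t-1", 1000), Partition("t-2", 1000))

    // Fixed order: t-0 and t-1 use up the 200-byte budget, t-2 gets nothing.
    println(serve(parts, maxBytes = 200, partitionMaxBytes = 100))

    // Shuffled order: over many requests, every partition is served sometimes.
    val rng = new Random(42)
    val servedCount = (1 to 1000).count { _ =>
      serveShuffled(parts, 200, 100, rng).toMap.apply("t-2") > 0
    }
    println(s"t-2 was served in $servedCount of 1000 simulated requests")
  }
}
```

In this model the fixed order never serves t-2, while the shuffled order leaves any given partition last (and therefore unserved) in only about one request out of three. Whether a plain shuffle or a rotating start offset is the better fit for the broker is a design choice this sketch does not settle.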
*Motivation*
1. Avoids starvation of later partitions when earlier partitions have a message
backlog, ensuring all partitions have a chance to be served.
2. Improves fairness and resource utilization in brokers serving multiple
partitions.
3. Has negligible impact on Kafka performance, as shuffling a small list of
partitions is very lightweight.
I will submit a related pull request (PR) later (link to be added).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)