Alexandre Dupriez created KAFKA-15678:
-----------------------------------------
Summary: [Tiered Storage] Stall remote reads with long-spanning transactions
Key: KAFKA-15678
URL: https://issues.apache.org/jira/browse/KAFKA-15678
Project: Kafka
Issue Type: Bug
Components: Tiered-Storage
Affects Versions: 3.6.0
Reporter: Alexandre Dupriez

Hi team,

I am facing an issue on the remote data path for uncommitted reads. As mentioned in [the original PR|https://github.com/apache/kafka/pull/13535#discussion_r1166887367], if a transaction spans a long sequence of segments, the time taken to retrieve the producer snapshots from the remote storage can, in the worst case, become prohibitive and block the reads if it consistently exceeds the deadline of fetch requests ({{fetch.max.wait.ms}}).

Essentially, the method used to compute the uncommitted records to return has an asymptotic complexity proportional to the number of segments in the log. That is not a problem with local storage, since the constant factor to traverse the files is small enough, but that is not the case with remote storage, which exhibits much higher read latency.

An aggravating factor was the lock contention in the remote index cache, which was mitigated by KAFKA-15084. Unfortunately, despite the improvements observed once that contention was removed, the algorithmic complexity of the current method used to compute uncommitted records can always defeat any optimisation made on the remote read path.

Maybe we could start thinking (if not already) about a different construct which would reduce that complexity to O(1) - i.e. to make the computation independent of the number of segments and irrespective of the span of the transactions.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
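To make the complexity argument concrete, here is a minimal sketch (hypothetical names, not Kafka's actual API) of the shape of the problem: resolving an uncommitted read today pays one remote producer-snapshot fetch per segment the open transaction spans, so the cost grows linearly with the segment count, whereas a log-level index consulted once would be independent of it.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Illustrative sketch only - class and method names are invented for this
// example and do not correspond to Kafka's internal API.
public class UncommittedReadCost {

    // Stand-in for one producer-snapshot fetch. With tiered storage each
    // call corresponds to a remote round trip, i.e. a large constant factor.
    static long fetchProducerSnapshot(long segmentBaseOffset) {
        return segmentBaseOffset; // placeholder payload
    }

    // Current shape of the computation: O(n) in the number of segments the
    // transaction spans - one remote snapshot fetch per segment.
    static List<Long> resolvePerSegment(List<Long> segmentBaseOffsets) {
        return segmentBaseOffsets.stream()
                .map(UncommittedReadCost::fetchProducerSnapshot)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A transaction spanning 10,000 segments forces 10,000 remote reads,
        // which can easily exceed the fetch.max.wait.ms deadline.
        List<Long> segments = LongStream.range(0, 10_000).boxed()
                .collect(Collectors.toList());
        System.out.println(resolvePerSegment(segments).size());
    }
}
```

The proposed direction would replace the per-segment loop with a single lookup into a structure maintained incrementally as transactions commit or abort, so the fetch path does no work proportional to the transaction's span.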