Shekharrajak opened a new pull request, #22572: URL: https://github.com/apache/kafka/pull/22572
Fixes https://issues.apache.org/jira/browse/KAFKA-20691

The current merge rescans the TreeSet from the start on every step. 10k contiguous same-state batches → 10k rescans of a shrinking set, surfacing as non-linear WriteShareGroupState latency when many small acknowledge ranges accumulate per partition.

How:
- Emit (firstOffset, BEGIN) and (lastOffset + 1, END) for each pruned batch.
- Sort events by (offset asc, END before BEGIN).
- Sweep events; keep a max-priority PriorityQueue of active batches + lazy-deletion HashSet.
- At every offset advance, emit a slice with the heap-top's (state, count); coalesce contiguous slices with identical (state, count).

Result:
- Worst case: O(n^2) → O((n + k) log n).
- Public API unchanged.
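For readers who want to see the shape of the sweep described in this pull request, here is a minimal, self-contained sketch. It is not the code from the PR: `SweepMergeSketch`, `Batch`, and `Event` are hypothetical names, and the priority rule (higher delivery count wins, then higher state value) is an assumption made only so the example has a defined "heap top".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

final class SweepMergeSketch {
    // Hypothetical stand-in for a persisted state batch (inclusive offsets).
    static final class Batch {
        final long firstOffset, lastOffset; final int state, deliveryCount;
        Batch(long firstOffset, long lastOffset, int state, int deliveryCount) {
            this.firstOffset = firstOffset; this.lastOffset = lastOffset;
            this.state = state; this.deliveryCount = deliveryCount;
        }
    }

    // BEGIN events sit at firstOffset, END events at lastOffset + 1.
    static final class Event {
        final long offset; final boolean isEnd; final int batchIndex;
        Event(long offset, boolean isEnd, int batchIndex) {
            this.offset = offset; this.isEnd = isEnd; this.batchIndex = batchIndex;
        }
    }

    static List<Batch> merge(List<Batch> batches) {
        List<Event> events = new ArrayList<>();
        for (int i = 0; i < batches.size(); i++) {
            events.add(new Event(batches.get(i).firstOffset, false, i));
            events.add(new Event(batches.get(i).lastOffset + 1, true, i));
        }
        // Offset ascending; at equal offsets END sorts before BEGIN.
        events.sort(Comparator.comparingLong((Event e) -> e.offset)
                .thenComparingInt(e -> e.isEnd ? 0 : 1));

        // Assumed priority: higher deliveryCount, then higher state. Reversed so
        // Java's min-heap PriorityQueue behaves as a max-priority queue.
        Comparator<Integer> byPriority = Comparator
                .comparingInt((Integer idx) -> batches.get(idx).deliveryCount)
                .thenComparingInt(idx -> batches.get(idx).state)
                .reversed();
        PriorityQueue<Integer> active = new PriorityQueue<>(byPriority);
        Set<Integer> ended = new HashSet<>(); // lazy deletion marks

        List<Batch> out = new ArrayList<>();
        long sliceStart = 0;
        int i = 0;
        while (i < events.size()) {
            long offset = events.get(i).offset;
            // Discard ended batches only when they surface at the heap top.
            while (!active.isEmpty() && ended.contains(active.peek())) {
                ended.remove(active.poll());
            }
            if (!active.isEmpty() && offset > sliceStart) {
                Batch top = batches.get(active.peek());
                append(out, sliceStart, offset - 1, top.state, top.deliveryCount);
            }
            // Apply every event that shares this offset.
            while (i < events.size() && events.get(i).offset == offset) {
                Event e = events.get(i++);
                if (e.isEnd) ended.add(e.batchIndex); else active.add(e.batchIndex);
            }
            sliceStart = offset;
        }
        return out;
    }

    // Coalesce with the previous slice when contiguous and (state, count) match.
    private static void append(List<Batch> out, long first, long last, int state, int count) {
        if (!out.isEmpty()) {
            Batch prev = out.get(out.size() - 1);
            if (prev.lastOffset + 1 == first && prev.state == state && prev.deliveryCount == count) {
                out.set(out.size() - 1, new Batch(prev.firstOffset, last, state, count));
                return;
            }
        }
        out.add(new Batch(first, last, state, count));
    }
}
```

In this sketch each batch contributes two events and at most one heap insertion and one heap removal, so the sweep stays within O(n log n) for n input batches, with no rescan from the start of the set.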
