HenryCaiHaiying opened a new pull request, #16453: URL: https://github.com/apache/iceberg/pull/16453
This PR addresses https://github.com/apache/iceberg/issues/16361

Previously, each DATA_COMPLETE envelope triggered a re-scan of the entire readyBuffer to count received partitions, making per-commit work O(N^2) in the number of buffered messages. Under a control-topic backlog, this extra work compounded the backlog and made recovery harder.

The fix: maintain a running receivedPartitionCount that is incremented in addReady() and reset in endCurrentCommit(). isCommitReady() becomes a constant-time comparison against expectedPartitionCount.

The change looks simple, but we also verified two edge cases: when the current commit fails and a new start_commit begins, and when two Coordinators are running (the previous Coordinator did not terminate and comes back as a zombie coordinator). In both situations the behavior is the same before and after the code change.

Added 2 new unit tests: one covering the state before and after a commit, and one covering a DATA message generated by a zombie coordinator.

Closes #16361
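
For readers who want to see the shape of the change, here is a minimal sketch of the running-counter idea described above. It is not the code in this PR: the class CommitStateSketch, the ReadyMessage type, and its partitionCount field are hypothetical stand-ins, and the use of >= in the readiness check is an assumption. Only the names addReady(), endCurrentCommit(), isCommitReady(), readyBuffer, receivedPartitionCount and expectedPartitionCount come from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the coordinator's commit state.
// It only illustrates the counter bookkeeping, not the real Iceberg class.
class CommitStateSketch {

    // Hypothetical message type: one buffered "ready" message that reports
    // how many partitions it covers.
    static final class ReadyMessage {
        final int partitionCount;

        ReadyMessage(int partitionCount) {
            this.partitionCount = partitionCount;
        }
    }

    private final List<ReadyMessage> readyBuffer = new ArrayList<>();

    // Running total, kept in step with readyBuffer so it never has to be
    // recomputed by scanning the buffer.
    private int receivedPartitionCount = 0;

    // Buffer the message and update the running total in O(1).
    void addReady(ReadyMessage message) {
        readyBuffer.add(message);
        receivedPartitionCount += message.partitionCount;
    }

    // Constant-time readiness check. Assumption: the commit is ready once
    // the running total reaches (or exceeds) the expected partition count.
    boolean isCommitReady(int expectedPartitionCount) {
        return receivedPartitionCount >= expectedPartitionCount;
    }

    // Clear the buffer and reset the counter so the next commit starts at zero.
    void endCurrentCommit() {
        readyBuffer.clear();
        receivedPartitionCount = 0;
    }

    public static void main(String[] args) {
        CommitStateSketch state = new CommitStateSketch();
        state.addReady(new ReadyMessage(2));
        System.out.println(state.isCommitReady(3)); // only 2 of 3 partitions seen
        state.addReady(new ReadyMessage(1));
        System.out.println(state.isCommitReady(3)); // all 3 partitions seen
        state.endCurrentCommit();
        System.out.println(state.isCommitReady(3)); // counter was reset
    }
}
```

The point of the sketch is that the counter and the buffer are only ever modified together, in addReady() and endCurrentCommit(), so the readiness check can trust the counter instead of walking the buffer on every incoming message.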
