HenryCaiHaiying opened a new pull request, #16453:
URL: https://github.com/apache/iceberg/pull/16453

   This PR addresses the issue mentioned in 
https://github.com/apache/iceberg/issues/16361
   
   Previously, each DATA_COMPLETE envelope triggered a re-scan of the entire 
readyBuffer to count received partitions, making per-commit work O(N^2) in the 
number of buffered messages. Under control-topic backlog this compounded the 
backlog and made recovery harder.
   
   The fix: maintain a running receivedPartitionCount that is incremented in 
addReady() and reset in endCurrentCommit(). isCommitReady() becomes a 
constant-time comparison against expectedPartitionCount.
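
   A minimal sketch of the running-counter idea, not the actual Iceberg code.
   Only the names addReady, endCurrentCommit, isCommitReady, readyBuffer,
   receivedPartitionCount and expectedPartitionCount come from the description
   above. The class name, the int[] payload standing in for a DATA_COMPLETE
   envelope, and the use of >= for the comparison are assumptions made for
   illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the coordinator's commit state.
class CommitStateSketch {
  // Each entry stands in for one buffered DATA_COMPLETE envelope and holds
  // the partition ids it reports (assumed payload shape).
  private final List<int[]> readyBuffer = new ArrayList<>();
  private final int expectedPartitionCount;
  // Running total, so readiness checks never re-scan readyBuffer.
  private int receivedPartitionCount = 0;

  CommitStateSketch(int expectedPartitionCount) {
    this.expectedPartitionCount = expectedPartitionCount;
  }

  // Buffer the envelope and bump the counter in the same step.
  void addReady(int[] partitions) {
    readyBuffer.add(partitions);
    receivedPartitionCount += partitions.length;
  }

  // Constant-time check; the >= comparison is an assumption.
  boolean isCommitReady() {
    return receivedPartitionCount >= expectedPartitionCount;
  }

  // Clear the buffer and reset the counter together so they cannot drift.
  void endCurrentCommit() {
    readyBuffer.clear();
    receivedPartitionCount = 0;
  }
}
```

   Resetting the counter in the same method that clears the buffer keeps the
   two in step, which is what matters when a commit fails and a new one starts.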
   
   The change looks simple, but we also verified two edge cases: the current 
commit fails and a new start_commit begins, and two Coordinators are running at 
the same time (the previous Coordinator didn't terminate and comes back as a 
zombie coordinator). In both cases the behavior is the same before and after 
the code change. Added 2 new unit tests: one covering the situation before and 
after the commit, and one covering a DATA message generated by a zombie 
coordinator.
   
   Closes #16361


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

