a2l007 opened a new pull request #10213: URL: https://github.com/apache/druid/pull/10213
Fixes #10193.

`CuratorLoadQueuePeon` no longer deletes segment load/drop entries when `druid.coordinator.load.timeout` expires. Deleting these entries after a timeout can cause the balancer to work incorrectly, as described in the linked issue. With this fix, segment entries remain in a peon's load/drop queue until the historical deletes the ZK entry, unless a non-timeout-related exception occurs. This lets the balancer account for the actual queue size of each historical and can lead to better balancing decisions.

<hr>

This PR has:
- [x] been self-reviewed.
   - [x] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
- [x] added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [x] been tested in a test Druid cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
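
The behavior change described above can be illustrated with a minimal sketch. This is not the code from the PR and not Druid's actual `CuratorLoadQueuePeon`; the class `SketchLoadQueue` and its methods `enqueue`, `onFailure`, `onZkEntryDeleted`, and `size` are hypothetical names chosen for illustration. It assumes only what the description states: a timeout leaves the entry in the queue, a non-timeout exception removes it, and the entry is otherwise removed once the historical deletes the ZK entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeoutException;

/** Hypothetical, simplified stand-in for a load queue peon; not Druid's actual class. */
class SketchLoadQueue {
    /** Pending requests: segment id mapped to the requested action ("load" or "drop"). */
    private final Map<String, String> pending = new ConcurrentHashMap<>();

    /** Records a load or drop request for a segment. */
    void enqueue(String segmentId, String action) {
        pending.put(segmentId, action);
    }

    /** Models the historical finishing the request and deleting its ZK entry. */
    void onZkEntryDeleted(String segmentId) {
        pending.remove(segmentId);
    }

    /** Handles a failed request; only non-timeout failures remove the entry. */
    void onFailure(String segmentId, Exception cause) {
        if (cause instanceof TimeoutException) {
            // Assumption from the PR description: on timeout the entry stays queued,
            // so the reported queue size still reflects work the historical may do.
            return;
        }
        pending.remove(segmentId);
    }

    /** Queue size as a balancer would observe it in this sketch. */
    int size() {
        return pending.size();
    }
}
```

In this sketch, a queue holding two entries still reports a size of 2 after one of them times out, and the count only drops when a non-timeout failure occurs or the ZK entry is deleted.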
