a2l007 opened a new pull request #10213:
URL: https://github.com/apache/druid/pull/10213
Fixes #10193.
`CuratorLoadQueuePeon` no longer deletes segment load/drop entries when
`druid.coordinator.load.timeout` expires. Deleting these entries after a
timeout can cause the balancer to work incorrectly, as described in the linked
issue.
With this fix, segment entries remain in a peon's load/drop queue until the
historical deletes the ZK entry, unless an exception unrelated to the timeout
occurs. This lets the balancer account for the actual queue size of each
historical, which can lead to better balancing decisions.
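
The behavior change can be illustrated with a minimal sketch. This is not the actual `CuratorLoadQueuePeon` code: the class `SketchLoadQueuePeon`, the `Outcome` enum, and all method names below are hypothetical, chosen only to show that a timeout leaves the entry in the queue while completion by the historical or a non-timeout failure removes it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of a load queue peon. It is NOT Druid's
// real implementation; names and structure are assumptions for illustration.
class SketchLoadQueuePeon {
    // Possible results of a pending load request in this sketch.
    enum Outcome { COMPLETED_BY_HISTORICAL, TIMED_OUT, FAILED }

    // Segment id -> segment size in bytes for requests still in the queue.
    private final Map<String, Long> segmentsToLoad = new ConcurrentHashMap<>();

    void loadSegment(String segmentId, long sizeBytes) {
        segmentsToLoad.put(segmentId, sizeBytes);
    }

    void onOutcome(String segmentId, Outcome outcome) {
        switch (outcome) {
            case TIMED_OUT:
                // The historical may still be working on this segment, so the
                // entry stays and keeps counting toward the queue size.
                break;
            case COMPLETED_BY_HISTORICAL:
            case FAILED:
                // The ZK entry was deleted by the historical, or a non-timeout
                // error occurred: the entry leaves the queue.
                segmentsToLoad.remove(segmentId);
                break;
        }
    }

    // Total bytes still queued; a balancer could use this to compare servers.
    long getLoadQueueSize() {
        return segmentsToLoad.values().stream().mapToLong(Long::longValue).sum();
    }

    int getNumberOfSegmentsInQueue() {
        return segmentsToLoad.size();
    }
}
```

In this sketch, a segment that timed out still contributes to `getLoadQueueSize()`, so a balancer reading that value would not treat the historical as idle while it is still processing the request.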
<hr>
This PR has:
- [x] been self-reviewed.
- [x] been reviewed using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md).
- [x] added comments explaining the "why" and the intent of the code
wherever it would not be obvious to an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [x] been tested in a test Druid cluster.