gianm commented on code in PR #14545:
URL: https://github.com/apache/druid/pull/14545#discussion_r1255758033
##########
indexing-service/src/main/java/org/apache/druid/indexing/overlord/RemoteTaskRunner.java:
##########
@@ -1401,29 +1401,31 @@ public Collection<Worker> markWorkersLazy(Predicate<ImmutableWorkerInfo> isLazyW
{
// skip the lock and bail early if we should not mark any workers lazy (e.g. number
// of current workers is at or below the minNumWorkers of autoscaler config)
- if (maxLazyWorkers < 1) {
- return Collections.emptyList();
+ if (lazyWorkers.size() >= maxLazyWorkers) {
+ return getLazyWorkers();
Review Comment:
Ah, yeah, okay, nothing except the reset on desync, of course 🙂. A worker
going completely offline and then coming back would also get removed from the
lazy list. So, it can happen, but not during normal operation for a stable
worker.

The original idea behind "lazy" workers is that they should be terminated
ASAP once in the lazy state. The terminology is mostly a joke about how those
workers are lazy and haven't done any work in a while. Before being marked
"lazy", they would have been in the "idle" state (not running any tasks) for
the `workerIdleTimeout`.
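
To make the idle -> lazy transition concrete, here is a minimal sketch of the
idea. It is not the actual `RemoteTaskRunner` code: the class
`LazyWorkerSketch`, its `idleSince` map, and the methods `workerBecameIdle` and
`workerWentOffline` are hypothetical names made up for illustration. Only
`lazyWorkers`, `maxLazyWorkers`, `workerIdleTimeout`, and the early-return
check come from the discussion above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the idle -> lazy transition.
// This is NOT Druid's RemoteTaskRunner; it only illustrates the idea.
final class LazyWorkerSketch {
    // Assumed bookkeeping: worker host -> instant it stopped running tasks.
    private final Map<String, Instant> idleSince = new LinkedHashMap<>();
    // Workers already marked lazy; these are expected to be terminated ASAP.
    private final List<String> lazyWorkers = new ArrayList<>();
    private final Duration workerIdleTimeout;

    LazyWorkerSketch(Duration workerIdleTimeout) {
        this.workerIdleTimeout = workerIdleTimeout;
    }

    void workerBecameIdle(String host, Instant at) {
        idleSince.put(host, at);
    }

    // A worker that goes fully offline drops off the lazy list as well.
    void workerWentOffline(String host) {
        idleSince.remove(host);
        lazyWorkers.remove(host);
    }

    List<String> markWorkersLazy(Instant now, int maxLazyWorkers) {
        // Bail early if we already have as many lazy workers as allowed.
        if (lazyWorkers.size() >= maxLazyWorkers) {
            return new ArrayList<>(lazyWorkers);
        }
        for (Map.Entry<String, Instant> entry : idleSince.entrySet()) {
            if (lazyWorkers.size() >= maxLazyWorkers) {
                break;
            }
            Duration idleFor = Duration.between(entry.getValue(), now);
            boolean idleLongEnough = idleFor.compareTo(workerIdleTimeout) >= 0;
            if (idleLongEnough && !lazyWorkers.contains(entry.getKey())) {
                lazyWorkers.add(entry.getKey());
            }
        }
        return new ArrayList<>(lazyWorkers);
    }

    public static void main(String[] args) {
        LazyWorkerSketch sketch = new LazyWorkerSketch(Duration.ofMinutes(10));
        Instant start = Instant.parse("2023-07-07T00:00:00Z");
        sketch.workerBecameIdle("worker-a", start);
        sketch.workerBecameIdle("worker-b", start.plus(Duration.ofMinutes(8)));
        System.out.println(sketch.markWorkersLazy(start.plus(Duration.ofMinutes(12)), 5));
    }
}
```

In this sketch a worker stays on the lazy list once it is marked, and only
leaves it through `workerWentOffline`, which mirrors the "not during normal
operation for a stable worker" point above.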

`blacklistedWorkers` is a different thing: it's meant to catch workers that
are having problems running tasks, so we don't keep scheduling tasks on them
and having them fail.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]