tgravescs commented on code in PR #42352:
URL: https://github.com/apache/spark/pull/42352#discussion_r1583032683
##########
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala:
##########
@@ -340,6 +385,45 @@ private[spark] class ExecutorAllocationManager(
}
}
+ /**
+ * Maximum number of executors to be removed per dra evaluation.
Review Comment:
It could potentially be relevant to batch queries, but the algorithm for
increasing the number of executors is also supposed to help avoid allocating
too many in the first place, which also isn't ideal for some workloads. I would
say that if you have suggestions and data proving that some drain algorithm
works better than the existing one, then we should make it configurable and add
it. Removing and adding executors can be very application dependent. You could
remove something and then need it a few seconds later... or you could have a
long tail task that takes an hour longer than everything else, and removing the
idle executors ASAP is going to save you money and get better cluster
utilization. The streaming case is naturally going to be different from many
batch workloads.
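To make "configurable" concrete, here is a rough sketch of what an optional
per-evaluation removal cap could look like. This is only an illustration: the
object, method, and parameter names are hypothetical and are not taken from
this PR or from Spark, and it deliberately leaves out how idle executors are
detected.

```scala
// Hypothetical sketch: none of these names come from the PR or from Spark.
object DrainSketch {
  /**
   * Pick which idle executors to remove in a single evaluation.
   *
   * @param idleExecutors      executor IDs currently eligible for removal
   * @param maxRemovalsPerEval optional cap; None means no cap is applied
   */
  def selectForRemoval(
      idleExecutors: Seq[String],
      maxRemovalsPerEval: Option[Int]): Seq[String] = {
    maxRemovalsPerEval match {
      // With a cap, drain at most `cap` executors this round; the rest stay
      // available in case the application needs them a few seconds later.
      case Some(cap) => idleExecutors.take(math.max(cap, 0))
      // Without a cap, every idle executor is released right away, which
      // suits the long-tail-task case.
      case None => idleExecutors
    }
  }
}
```

With something like this, an application that tends to need executors back
quickly could set a small cap, while one dominated by a long tail task could
leave the cap unset.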
I do think it should be a separate PR and discussion that adds it to batch,
though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]