tgravescs commented on code in PR #42352:
URL: https://github.com/apache/spark/pull/42352#discussion_r1583032683


##########
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala:
##########
@@ -340,6 +385,45 @@ private[spark] class ExecutorAllocationManager(
     }
   }
 
+  /**
+   * Maximum number of executors to be removed per dra evaluation.

Review Comment:
   It could potentially be relevant to batch queries, but the algorithm for
increasing the number of executors is also supposed to help avoid allocating
too many in the first place, which itself isn't ideal for some workloads. If
you have suggestions, and data showing that some drain algorithm works better
than the existing one, then we should make it configurable and add it.
Adding and removing executors can be very application dependent. You could
remove an executor and then need it again a few seconds later. Or you could
have a long-tail task that takes an hour longer than everything else, in which
case removing the idle executors ASAP is going to save you money and improve
cluster utilization. The streaming case is naturally going to be different from
many batch workloads.
   
   I do think adding it to batch should be a separate PR and discussion,
though.
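
   For reference, a rough sketch of the existing dynamic allocation settings
that already shape how quickly executors are added and removed. The keys are
standard Spark configuration properties; the values are illustrative
assumptions, not recommendations, and none of them is the per-evaluation
removal cap discussed in this PR.

```
# spark-defaults.conf (illustrative values only, chosen for this sketch)
spark.dynamicAllocation.enabled                          true
spark.dynamicAllocation.minExecutors                     2
spark.dynamicAllocation.maxExecutors                     50
# Request executors once tasks have been backlogged this long (default 1s)
spark.dynamicAllocation.schedulerBacklogTimeout          1s
# Interval for follow-up requests while the backlog persists
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 1s
# Remove an executor once it has been idle this long (default 60s)
spark.dynamicAllocation.executorIdleTimeout              60s
# Values below 1.0 request fewer executors than full task parallelism needs
spark.dynamicAllocation.executorAllocationRatio          0.5
```

   A longer idle timeout suits the "need it again a few seconds later" case,
while a shorter one suits the long-tail case, which is why a single default
rarely fits every application.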



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

