mridulm commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-826506510


   As I mentioned in the doc, are we trying to retrofit scenarios that 
Spark is not designed to handle? Namely: some task for some stage must run 
only on a particular executor and nowhere else.
   I agree with @cloud-fan that there are too many interacting aspects that 
need to be looked at carefully here (resource allocation, fault tolerance, 
utilization, the risk of a task waiting indefinitely to be scheduled, etc.).
   
   On the other hand, the use case @tgravescs mentioned is an interesting one: how 
to steer scheduling behavior towards specific resource usage patterns, like 
bin-packing tasks onto executors. I think there have been past PRs in that 
direction (particularly in the context of elastic cloud environments).
   Those decisions require a global view of the cluster, though, not just of a 
single executor; see the sketch below.
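   
   To make the "global view" point concrete, here is a hedged illustration: 
the placement choice for a single resource offer depends on the load of every 
executor, not just the one making the offer. All names here are hypothetical 
and do not correspond to anything in Spark.
   
   ```scala
   // Hypothetical sketch: bin-packing requires cluster-wide state.
   object BinPackingSketch {
     // Assumed model of per-executor load; not a Spark type.
     case class ExecutorLoad(id: String, runningTasks: Int, slots: Int)
   
     /** Pick the busiest executor that still has a free slot (first-fit on a
      *  descending sort), so lightly loaded executors drain and can later be
      *  released by dynamic allocation. Needs the load of *all* executors. */
     def binPackTarget(cluster: Seq[ExecutorLoad]): Option[String] =
       cluster
         .filter(e => e.runningTasks < e.slots)
         .sortBy(e => -e.runningTasks)
         .headOption
         .map(_.id)
   }
   ```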
   
   Making task scheduling pluggable would be an interesting experiment, but 
it has to be approached carefully given these interactions. Also, from an 
interface point of view, we want to ensure it is not specific to a single 
use case.
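   
   As a rough idea of what a use-case-agnostic plugin point could look like, 
here is a minimal sketch. These names do not exist in Spark; a real design 
would have to address the fault-tolerance and starvation concerns above.
   
   ```scala
   // Hypothetical plugin interface, not an existing Spark API.
   trait TaskPlacementPolicy {
     /** Rank the offered executors for a task; an empty result must fall
      *  back to default scheduling rather than leaving the task
      *  unschedulable (avoiding the indefinite-wait problem). */
     def rankExecutors(taskId: Long, offeredExecutors: Seq[String]): Seq[String]
   }
   
   /** Default policy: no preference, i.e. keep existing behavior. */
   object PassThroughPolicy extends TaskPlacementPolicy {
     override def rankExecutors(taskId: Long, offeredExecutors: Seq[String]): Seq[String] =
       offeredExecutors
   }
   ```
   
   The key design choice in such an interface is that the policy only ranks 
offers; it cannot veto scheduling outright, which keeps fault tolerance and 
utilization in the scheduler's hands rather than the plugin's.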

