squito commented on issue #24167: [SPARK-27214] Upgrading locality level when
task set is starving
URL: https://github.com/apache/spark/pull/24167#issuecomment-476032313
 
 
   I think this is actually a pretty big change in behavior, and at the very
least it would need to go behind a conf.  I had a discussion with Kay
Ousterhout about this type of situation and delay scheduling (need to search
for the jira ...) -- I had proposed not this exact solution, but something
along these lines, and she said that to some extent this was the intended
behavior when there were multiple active jobs in a "job server" style
deployment.  In that case it is OK for one job to end up waiting a while, to
keep resources free for another job which might be able to use those
resources with better locality.
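
   To be concrete about what I mean by a conf, here is a rough sketch using
Spark's internal ConfigBuilder; the key name, doc string, and default are
hypothetical, just to illustrate the shape, not something from this PR:

```scala
import org.apache.spark.internal.config.ConfigBuilder

object LocalityConf {
  // Hypothetical flag gating the new behavior; the key and the default
  // are illustrative only, not something actually added in this PR.
  val LOCALITY_UPGRADE_WHEN_STARVING =
    ConfigBuilder("spark.locality.upgradeWhenStarving")
      .doc("When true, a starving TaskSetManager may skip delay scheduling " +
        "and launch tasks at a worse locality level instead of waiting.")
      .booleanConf
      .createWithDefault(false)  // off by default keeps today's behavior
}
```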
   
   In the meantime, I've actually often advised users on large clusters to
turn the locality wait down to 0 (the odds of getting locality go down on
larger clusters anyway).  Have you considered that?  Note that Spark still
tries to schedule for locality even with the wait = 0; it just doesn't *wait*
until it gets the desired locality.
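
   That's just the stock `spark.locality.wait` setting; a minimal sketch of
turning it off via SparkConf (the app name here is made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

// Disable delay scheduling: Spark still *prefers* local offers when they
// arrive, but it no longer holds tasks back waiting for a better one.
val spark = SparkSession.builder()
  .appName("no-locality-wait")          // hypothetical app name
  .config("spark.locality.wait", "0s")  // master knob; the per-level waits
                                        // (process/node/rack) default to it
  .getOrCreate()
```

   Equivalently, you can pass `--conf spark.locality.wait=0s` to spark-submit.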
   
   While your proposal is reasonable, it's also really hard to say whether
it's a good idea in general.  One big change that you're not calling out --
it *always* turns off delay-scheduling until some task has finished.  That
could be a very big change for some use cases.
