liupc edited a comment on issue #24167: [SPARK-27214]Upgrading locality level 
when task set is starving
URL: https://github.com/apache/spark/pull/24167#issuecomment-476452893
 
 
   @squito  Thank you for your reply. I think this PR is useful in batch 
processing when a single job is submitted. 
   
   > to some extent, this was the intended behavior for when there were 
multiple active jobs in a "job server" style deployment. Then it is OK for one 
job to end up waiting a while, to keep resources free for another job which 
might be able to use those resources with better locality.
   
   I don't agree with this, because it's non-deterministic: you can hardly say 
whether delay scheduling brings gains or pains. What's more, I think that in 
multi-job cases such as the SparkSQL thriftserver, resource allocation that 
takes locality into account should be done in the scheduler (maybe not only 
through scheduling order but also through some max limit or other strategies), 
such as FIFOScheduler/FairScheduler. And in multi-job cases with the FIFO 
scheduler, if we let one job wait a long time while a later job runs with more 
resources, it seems to break what the user expects: the job that comes first 
should be executed quickly and finish first.
   
   > In the meanwhile, I've actually often advised users on large clusters to 
turn the locality wait down to 0 (the odds of getting locality goes down on 
larger clusters anyway). Have you considered that? Note that spark still tries 
to schedule for locality even with the wait = 0; it just doesn't wait until it 
gets the desired locality.
   
   Yes, setting `spark.locality.wait=0` works in this case, and I am currently 
using this conf as a temporary fix for the issue in batch mode until this 
solution is available. But I think setting it to zero is a little bit 
aggressive, and it's not easy for users to decide how large a cluster must be 
to count as a large cluster. So it's not bad to add this option for batch 
processing.
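
   To make the effect of the wait concrete, here is a minimal sketch of delay 
scheduling in plain Python. It is a simplified model, not Spark's actual 
`TaskSetManager` code: the function names and task ids are hypothetical, the 
`NO_PREF` level is omitted, and a single wait value is assumed for every level. 
The 3000 ms figure is the default value of `spark.locality.wait`.

```python
# Simplified model of delay scheduling; not Spark's actual TaskSetManager code.
# Level names follow Spark's TaskLocality, ordered from best to worst.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]


def allowed_level(elapsed_ms, wait_ms):
    """Return the loosest locality level allowed after elapsed_ms without a launch."""
    index = 0
    remaining = elapsed_ms
    # Each full wait period that passes relaxes the constraint by one level.
    while index < len(LEVELS) - 1 and remaining >= wait_ms:
        remaining -= wait_ms
        index += 1
    return LEVELS[index]


def pick_task(pending, elapsed_ms, wait_ms):
    """Pick the pending task with the best locality among those currently allowed.

    pending maps a (hypothetical) task id to the best level this offer gives it.
    """
    limit = LEVELS.index(allowed_level(elapsed_ms, wait_ms))
    eligible = [(LEVELS.index(level), task) for task, level in pending.items()
                if LEVELS.index(level) <= limit]
    return min(eligible)[1] if eligible else None


pending = {"t1": "ANY", "t2": "NODE_LOCAL"}
print(pick_task(pending, elapsed_ms=0, wait_ms=3000))  # nothing is allowed yet
print(pick_task(pending, elapsed_ms=0, wait_ms=0))     # best available locality wins
```

   In this model, a wait of 0 makes every level eligible immediately, yet the 
task with the best available locality is still chosen first, which matches the 
point quoted above about `wait = 0`.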
   
   > One big change that you're not calling out -- it always turns off 
delay-scheduling until some task has finished. That could be a very big change 
for some use cases.
   
   Yes, this change is big and it may break delay scheduling, so I agree with 
your suggestion to add an option to enable it. Alternatively, we can enable it 
only when the `SchedulingMode` is `FIFO`, which is mostly used in batch 
processing.
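
   The two alternatives could be combined roughly as follows. This is only a 
sketch in plain Python: the function name and the `explicit_flag` parameter are 
hypothetical and do not correspond to any existing Spark setting, and falling 
back to the scheduling mode when no flag is given is an assumption of this 
example.

```python
# Hypothetical gating logic; the flag and function names are illustrative only.
def starvation_upgrade_enabled(scheduling_mode, explicit_flag=None):
    """Decide whether to upgrade the locality level when a task set is starving."""
    if explicit_flag is not None:
        return explicit_flag          # an explicit user setting always wins
    return scheduling_mode == "FIFO"  # otherwise enable only for FIFO


print(starvation_upgrade_enabled("FIFO"))
print(starvation_upgrade_enabled("FAIR"))
print(starvation_upgrade_enabled("FAIR", explicit_flag=True))
```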

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
