[GitHub] [spark] tgravescs commented on pull request #32136: [WIP][SPARK-35022][CORE] Task Scheduling Plugin in Spark

GitBox Tue, 13 Apr 2021 06:56:52 -0700


tgravescs commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-818756850



   I would like to see more of an overview and more design details.  I think 
having something here is a good idea, because some people may want to cluster 
tasks while others may want to spread them. You might also want to place them 
based on hardware or something else.  I want to understand how flexible the 
plugin API you are proposing is.
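
   To make the cluster-versus-spread point concrete, here is a rough sketch 
of two placement policies behind one hook. This is not the API in this PR and 
not Spark code: the `(executor_id, host)` offer shape, the `spread`, `cluster` 
and `place` names, and the per-host running-task counts are all assumptions 
made up for illustration, and executor slot limits are ignored.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of a resource offer: (executor_id, host). Not a Spark type.
Offer = Tuple[str, str]
Policy = Callable[[List[Offer], Dict[str, int]], int]


def spread(offers: List[Offer], running_per_host: Dict[str, int]) -> int:
    """Return the index of the offer on the least-loaded host (first wins ties)."""
    return min(range(len(offers)),
               key=lambda i: running_per_host.get(offers[i][1], 0))


def cluster(offers: List[Offer], running_per_host: Dict[str, int]) -> int:
    """Return the index of the offer on the most-loaded host (first wins ties)."""
    return max(range(len(offers)),
               key=lambda i: running_per_host.get(offers[i][1], 0))


def place(num_tasks: int, offers: List[Offer],
          running_per_host: Dict[str, int], policy: Policy) -> List[str]:
    """Assign num_tasks tasks one at a time, updating host load after each pick.

    Slot limits per executor are deliberately ignored in this sketch.
    """
    counts = dict(running_per_host)  # copy so the caller's dict is untouched
    placement = []
    for _ in range(num_tasks):
        executor, host = offers[policy(offers, counts)]
        placement.append(executor)
        counts[host] = counts.get(host, 0) + 1
    return placement


offers = [("e1", "hostA"), ("e2", "hostB"), ("e3", "hostC")]
running = {"hostA": 2, "hostB": 0, "hostC": 5}
spread_result = place(3, offers, running, spread)
cluster_result = place(3, offers, running, cluster)
```

   The point of the sketch is only that both behaviours, and a hardware-aware 
one, would need the same inputs (the offers plus some view of current load), 
which is the kind of flexibility the design description should spell out.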
   
   I just saw 
https://docs.google.com/document/d/1wfEaAZA7t02P6uBH4F3NGuH_qjK5e4X05v1E5pWNhlQ/edit#
 which has a few details.  It would be good to link it from the PR description. 
   
   Questions:
   1) What happens with locality? It looks like this is plugged in after 
locality. Are you disabling locality, or does your use case not have any? If we 
create a plugin to choose location, I wouldn't necessarily want locality to 
take effect.
   2) "@param tasks The full list of tasks" => is this all tasks, even ones 
that are done? Would you want to know which ones are already running or which 
ones have succeeded?
   3) This is being called from a synchronized block. At the very least we 
need to document that better, along with the effect it could have on 
scheduling time.
   4) It looks like your plugin runs before blacklisting. Is this really what 
we want, or would the plugin want to know about it to make a better decision?
   5) How does this interact with barrier scheduling, where things are reset 
if it doesn't get scheduled?
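
   As an illustration of questions 2 and 4, a plugin input could carry the 
state of each task and the set of excluded executors, so the plugin only 
reasons about pending tasks and usable executors. A minimal sketch, where 
`TaskState`, `PluginContext` and `candidates` are hypothetical names and 
nothing here is taken from the PR or from Spark:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List, Tuple


class TaskState(Enum):
    """Hypothetical task states a plugin might want to distinguish."""
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"


@dataclass(frozen=True)
class PluginContext:
    """Hypothetical read-only input handed to a scheduling plugin."""
    task_states: Dict[int, TaskState]           # task index -> current state
    excluded_executors: FrozenSet[str] = frozenset()  # e.g. blacklisted ones


def candidates(ctx: PluginContext,
               executors: List[str]) -> Tuple[List[int], List[str]]:
    """Return (pending task indexes, executors that are not excluded)."""
    pending = sorted(i for i, s in ctx.task_states.items()
                     if s is TaskState.PENDING)
    usable = [e for e in executors if e not in ctx.excluded_executors]
    return pending, usable


ctx = PluginContext(
    task_states={0: TaskState.SUCCEEDED, 1: TaskState.RUNNING,
                 2: TaskState.PENDING, 3: TaskState.PENDING},
    excluded_executors=frozenset({"e2"}),
)
pending_tasks, usable_executors = candidates(ctx, ["e1", "e2", "e3"])
```

   Keeping the input immutable and small would also help with question 3, 
since whatever the plugin does with it runs while the scheduler lock is held.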
   
   
   I would like to see how this applies to the other use cases I mentioned 
above before putting this in.
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
