viirya commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-819155836


   > I just saw 
https://docs.google.com/document/d/1wfEaAZA7t02P6uBH4F3NGuH_qjK5e4X05v1E5pWNhlQ/edit#
 which has a few details. Would be good to link from description.
   
   Linked it from the description. Thanks for the reminder.
   
   
   > questions:
   > 
   >     1. what happens with locality? it looks like this is plugged in after 
locality, are you disabling locality then or it doesn't have any for your use 
case?   if we create a plugin to choose location I wouldn't necessarily want 
locality to take effect.
   
   For now it still respects locality. My intent is to avoid interfering with 
the scheduler too much. For each locality level, the scheduler tries to pick 
one task from the list of tasks for that level. The API steps in at that point 
and lets the scheduler know which tasks are most preferred on the executor. If 
users don't want locality to take effect, they can disable it through the 
locality configs. Alternatively, from the API perspective, if the plugin 
doesn't want a particular task to run on an executor, it can tell the scheduler 
so by leaving that task out of the returned list.
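
   The per-locality-level flow described above can be sketched roughly as 
follows. This is an illustration in Python, not the PR's Scala code: the names 
`dequeue_for_level`, `rank` and `prefer_high_indexes` are hypothetical, and the 
real scheduler does considerably more.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical plugin signature (an assumption, not the PR's API): given an
# executor, the full task list, and the candidate indexes for one locality
# level, return the indexes in preferred order. Indexes left out of the
# result are not offered to this executor.
RankFn = Callable[[str, Sequence[str], List[int]], List[int]]


def dequeue_for_level(
    executor_id: str,
    tasks: Sequence[str],
    pending_for_level: List[int],
    rank: RankFn,
) -> Optional[int]:
    """Pick at most one task index for `executor_id` from one locality level."""
    # Locality is still respected: only this level's candidates are passed in.
    candidates = list(pending_for_level)
    preferred = rank(executor_id, tasks, candidates)
    allowed = set(candidates)
    for index in preferred:
        # Ignore anything the plugin returns that was not a candidate.
        if index in allowed:
            pending_for_level.remove(index)
            return index
    return None


def prefer_high_indexes(
    executor_id: str, tasks: Sequence[str], task_indexes: List[int]
) -> List[int]:
    # Example policy: never run task 0 on "exec-1"; otherwise highest index first.
    kept = [i for i in task_indexes if not (executor_id == "exec-1" and i == 0)]
    return sorted(kept, reverse=True)
```

   In this sketch, omitting an index from the returned list is the only way 
the plugin vetoes a task for an executor; the task stays pending and can still 
be offered to another executor.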
   
   >     2. @param tasks The full list of tasks => this is all tasks even if 
done?  Would you want to know which ones are running already or which ones have 
succeeded?
   
   We may not need to know. The parameter is open to discussion. Since 
`taskIndexes` are indexes into the full task list, passing the full list makes 
it easy to look up a task by its index. If preferred, it could instead be the 
subset of tasks, i.e. only the tasks that the passed-in task indexes point to.
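
   To make the two parameter shapes concrete, here is a small Python sketch. 
The function names are hypothetical and only illustrate the trade-off; they 
are not part of the proposed API.

```python
from typing import List, Sequence, Tuple


def lookup_with_full_list(
    tasks: Sequence[str], task_indexes: List[int]
) -> List[Tuple[int, str]]:
    # Full list passed in: each index resolves directly with tasks[i].
    return [(i, tasks[i]) for i in task_indexes]


def subset_for_indexes(tasks: Sequence[str], task_indexes: List[int]) -> List[str]:
    # Alternative shape: pass only the referenced tasks. Position k in the
    # subset then corresponds to task_indexes[k], not to the index itself.
    return [tasks[i] for i in task_indexes]
```

   With the subset shape the plugin sees less data, but it has to map a 
position in the subset back to the original index before returning it.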
   
   >     3. this is being called from synchronized block, at the very least we 
need to document it better, along with the effects it could have on scheduling 
time
   
   Yeah, I thought about it but forgot to add the comment. I will add it in 
the next commit.
   
   >     4. it looks like your plugin runs before blacklisting, is this really 
what we want or would the plugin like to know, to make a better decision?
   
   Good point. I think the reason it runs before blacklisting is that it is 
easier to fit into the current logic and seems safer. Currently the scheduler 
iterates over each task in the list and checks the blacklist before picking it 
up. If I let the API get the list after blacklisting, it seems it might be a 
larger change to the dequeue logic.
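
   The ordering in question (plugin first, blacklist check second) looks 
roughly like this. It is a Python sketch with hypothetical names 
(`pick_first_allowed`, `is_blacklisted`), not the actual dequeue code.

```python
from typing import Callable, List, Optional


def pick_first_allowed(
    executor_id: str,
    ordered_indexes: List[int],
    is_blacklisted: Callable[[str, int], bool],
) -> Optional[int]:
    """Walk the plugin-ordered indexes and return the first usable one."""
    for index in ordered_indexes:
        # The blacklist is consulted per task, after the plugin has already
        # ordered the candidates, so the plugin never sees this information.
        if is_blacklisted(executor_id, index):
            continue
        return index
    return None
```

   Because the check happens inside the loop, the plugin's first choice may be 
skipped silently, which is the trade-off raised in the question.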
   
   >     5. how does this interact with barrier where it resets things if it 
doesn't get scheduled?
   
   IIUC, the plugin does not affect or change how the scheduler handles barrier 
tasks. In the current dequeue logic, the scheduler doesn't behave differently 
for barrier tasks and general tasks. For now, if the scheduler cannot schedule 
all barrier tasks at once, it resets the assigned resource offers. The same 
holds with the plugin.
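
   The all-or-nothing behavior for barrier tasks can be illustrated with a 
simplified Python sketch. The function `schedule_barrier_stage` and the slot 
bookkeeping are assumptions made for illustration, not Spark's implementation.

```python
from typing import Dict, List, Tuple


def schedule_barrier_stage(
    num_tasks: int, free_slots: Dict[str, int]
) -> List[Tuple[int, str]]:
    """Assign every barrier task to a slot, or assign none and restore slots."""
    snapshot = dict(free_slots)
    assignments: List[Tuple[int, str]] = []
    for task in range(num_tasks):
        # Take the first executor that still has a free slot.
        executor = next((e for e, n in free_slots.items() if n > 0), None)
        if executor is None:
            break
        free_slots[executor] -= 1
        assignments.append((task, executor))
    if len(assignments) < num_tasks:
        # Not all barrier tasks fit at once: undo the partial assignment.
        free_slots.clear()
        free_slots.update(snapshot)
        return []
    return assignments
```

   A plugin that only reorders or filters candidates sits before this step, so 
the reset applies whether or not a plugin is configured.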
   
   

