viirya commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-857859603


   > Yes. I'm thinking a bit more: we probably don't even need the stage-level 
scheduling API ability. After knowing the "mapping", we can use it directly in 
resourcesMeetTaskRequirements. The "mapping" is actually a hard-coded task 
requirement, and using the stage-level scheduling API to specify that 
requirement looks redundant and unnecessary.
   
   That makes sense. So let me rephrase it, and correct me if I've 
misunderstood.
   
   Basically, we introduce a new task location, `StateStoreTaskLocation`. RDDs 
that use a state store return this kind of task location as their preferred 
locations.
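   To make the idea concrete, here is a minimal sketch of what such a task location could look like. The class name `StateStoreTaskLocation` comes from the proposal above, but the fields and the string encoding are illustrative assumptions, not existing Spark API; Spark's real `TaskLocation` subclasses similarly serialize to prefixed strings.

   ```scala
   // Hypothetical sketch: a task location identifying a state store
   // instance (operator + partition) rather than a host. The field
   // names and encoding are assumptions for illustration only.
   case class StateStoreTaskLocation(
       checkpointId: String,
       operatorId: Long,
       partitionId: Int) {
     // Serialize to a prefixed string, mirroring how Spark's
     // TaskLocation implementations encode themselves.
     override def toString: String =
       s"statestore_${checkpointId}_${operatorId}_$partitionId"
   }

   object StateStoreTaskLocation {
     private val prefix = "statestore_"

     // Recover a StateStoreTaskLocation from its string form, if valid.
     def parse(s: String): Option[StateStoreTaskLocation] =
       if (s.startsWith(prefix)) {
         s.stripPrefix(prefix).split("_") match {
           case Array(ckpt, op, part) =>
             Some(StateStoreTaskLocation(ckpt, op.toLong, part.toInt))
           case _ => None
         }
       } else None
   }
   ```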
   
   When `TaskSetManager` builds the pending task list, it can establish a 
mapping from these locations. The mapping could be between specific resources 
(e.g. PVCs) and tasks (i.e. state stores). `resourcesMeetTaskRequirements` 
then uses the mapping directly to schedule tasks.
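   The check described above could be sketched as follows. This is not the actual `resourcesMeetTaskRequirements` in Spark's scheduler; the signature, the `Map[Int, String]` mapping shape, and the PVC-name strings are all assumptions made to illustrate the "hard-coded task requirement" idea.

   ```scala
   // Hypothetical sketch of the hard-coded requirement check: the
   // mapping (built while TaskSetManager constructs the pending task
   // list) ties each state store partition to a specific resource,
   // e.g. a PVC name. A task's requirement is met only when the
   // offered resources include the resource mapped to its partition.
   object StateStoreScheduling {
     // partitionId -> resource address (e.g. PVC name); illustrative.
     type StoreToResource = Map[Int, String]

     def resourcesMeetTaskRequirements(
         partitionId: Int,
         offeredResources: Set[String],
         mapping: StoreToResource): Boolean =
       mapping.get(partitionId) match {
         case Some(pvc) => offeredResources.contains(pvc)
         case None      => true // no state store requirement for this task
       }
   }
   ```

   The point of folding the check in here, rather than going through the stage-level scheduling API, is that the requirement is fixed by the mapping itself, so there is nothing for a user-facing API to specify.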
   

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]