viirya commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-845734433


   > So what if the state store resource is **required** not **optional**? It 
means, the task won't launch until getting the required state store. So in your 
PVC case, the task will wait until it re-mount to some executors. And if we 
make state store resource required, we should do the similar thing for the HDFS 
state store on executor lost. For example, we should reconstruct the state 
store on other active executors (or even we don't have to reconstruct the state 
store in reality but move the `StateStoreProviderId`s to other active 
executors' metadata (e.g., ExecutorData) should be enough) so that the state 
store resources always exist and scheduling won't hang.
   
   No. In our use case, we want to get rid of HDFS for state store 
checkpointing, so the task will wait until the PVC re-mounts to another new 
executor. Our state store is checkpointed to the PVC, not to HDFS.
   
   That is why I question whether stage-level scheduling can handle such a 
case, since handling it is one of the requirements of this proposed plugin API.
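
   To make the required-versus-optional distinction concrete, here is a toy sketch in plain Python. It is not Spark's scheduler, stage-level scheduling, or the proposed plugin API: `Executor`, `pick_executor`, and the executor and store ids are all hypothetical names made up for illustration. It only models the behaviour described above, where a task with a required state store stays pending until some executor holds that store again.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Executor:
    """Hypothetical executor record; `state_stores` holds the store ids it currently has."""
    executor_id: str
    state_stores: Set[str] = field(default_factory=set)


def pick_executor(executors: Dict[str, Executor], store_id: str,
                  required: bool) -> Optional[str]:
    """Return an executor id for a task needing `store_id`, or None if the task must wait."""
    for ex in executors.values():
        if store_id in ex.state_stores:
            return ex.executor_id
    if required:
        # Required resource: no executor holds the store, so the task stays pending.
        return None
    # Optional resource: fall back to any live executor (None if there are none).
    return next(iter(executors), None)


# exec-1 holds the state store (think: the PVC is mounted there).
executors = {"exec-1": Executor("exec-1", {"store-0"})}
first = pick_executor(executors, "store-0", required=True)

# exec-1 is lost; exec-2 comes up, but the PVC is not mounted anywhere yet.
del executors["exec-1"]
executors["exec-2"] = Executor("exec-2")
pending = pick_executor(executors, "store-0", required=True)
fallback = pick_executor(executors, "store-0", required=False)

# The PVC re-mounts on exec-2, so the required task can be scheduled again.
executors["exec-2"].state_stores.add("store-0")
resumed = pick_executor(executors, "store-0", required=True)
```

   In this model the required task is unschedulable between the executor loss and the re-mount, which is the waiting period in question; whether a real scheduler can express that wait is exactly what is being asked.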
   
   




