Ngone51 commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-846336246


   > If the persistent volume is a resource, then it will have to be there on 
executor startup, so I guess a new executor checks for it on startup and 
advertises it. At that point, how does a task tell the difference between its 
state store and another one? The scheduler for resources only checks that the 
number of a particular resource matches the task requirements; it doesn't 
differentiate between different state store "ids". So it will assign a task to 
any executor with a PV.
   
   @tgravescs To clarify, I think it's the state store, rather than the PV, that 
is the resource here. And the use case here might be a bit different from the 
classic one (where the resources must be specified at executor launch). In this 
case, the executor can only report the state store resource to the driver the 
first time a state store instance is constructed, which means this special 
resource gets updated at runtime. (Streaming already has the 
`ReportActiveInstance` event to report state store instances, and we could 
extend it to update the executor's state store resources.) To follow the 
existing custom resource management, the state store resource might be 
maintained as ("statestore" -> list of state store ids) as part of the custom 
resources.
   
   
   > Another point - what if the entire node is lost, not just the executor? I 
guess the discovery script starting up the new executor would load it from 
HDFS?
   
   So, as mentioned above, we don't need a discovery script here. But yes, the 
state store instances can be loaded back from HDFS (a rough sketch of that 
recovery path follows).
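   As a purely illustrative sketch of that recovery path (the checkpoint layout, 
the file format, and every name here, e.g. `StateStoreRecoverySketch.loadVersion`, 
are assumptions, not Spark's actual HDFSBackedStateStoreProvider code), a newly 
started executor could rebuild a lost store instance from its HDFS checkpoint 
like this, with no discovery script involved:

```scala
// Hypothetical sketch only: rebuilds one state store version from a snapshot
// file on HDFS, e.g. <checkpointRoot>/<operatorId>/<partitionId>/<version>.snapshot.
import java.io.{BufferedReader, InputStreamReader}
import scala.collection.mutable
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object StateStoreRecoverySketch {
  // Load a key -> value snapshot for one store version from a checkpoint path.
  // Assumes one "key<TAB>value" pair per line, purely for illustration.
  def loadVersion(
      checkpointRoot: String,
      version: Long,
      hadoopConf: Configuration): mutable.Map[String, String] = {
    val path = new Path(s"$checkpointRoot/$version.snapshot")
    val fs = path.getFileSystem(hadoopConf)
    val state = mutable.Map.empty[String, String]
    if (fs.exists(path)) {
      val reader = new BufferedReader(new InputStreamReader(fs.open(path)))
      try {
        // Replay every key/value pair from the snapshot into the in-memory map.
        Iterator.continually(reader.readLine()).takeWhile(_ != null).foreach { line =>
          val Array(k, v) = line.split("\t", 2)
          state(k) = v
        }
      } finally {
        reader.close()
      }
    }
    state
  }
}
```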

