viirya commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-833680616
> I don't understand this. if the executor is lost isn't your state store lost? if its just on the host separate from executor, then host locality should still apply. Or are you referring to some new checkpoint mechanism where after a failure you would want the locality to change to where that checkpoint is?

For example, we let the state store checkpoint to a persistent volume on the executor instead of HDFS. Once the executor is lost, we can mount the PV back on a new executor or another existing executor. At that point we don't want the state store to be assigned to just any executor, but to the executor that has the PV with the checkpointed state (see the first sketch at the bottom of this comment).

> I'm definitely not against improving it or scheduling in general, and I never said that. I'm hesitant about the current proposal and implementation. I want clear goals and use cases it applies to, in order to make sure its implemented and solved in the proper way. I think my biggest complaint here is there is no complete overview or details, this kind of has piece meal explanations that when I look at seem to have holes in it, so I don't have a complete picture. I'm not sure this is the best place to plugin at if you really want flexibility.

Sorry if I didn't state the goals and our use cases clearly; I thought I had already explained them above during the discussion. Let me restate them:

We need the ability to have tasks (in our case, mainly stateful tasks) scheduled on the executors we want. In other words, we need the ability to control where a task is scheduled.

Our goal is to improve SS state store scheduling and checkpointing. Currently users can only rely on locality. However, locality has the first-batch issue (explained in a previous comment). Also, relying on a user-facing config to solve a platform-level problem seems fragile; not every user knows to set it.

So our use case is clearly SS jobs with a state store. One use case, checkpointing to a PV, is described at the beginning of this comment. The use case is specific to SS, but since Spark doesn't have a scheduling plugin/API, we cannot limit the change to the SS module only.

> For instance I asked about this plugging in after locality and you said "If users don't want locality to take effect, it is doable by disabling locality configs". But this disables it for the entire job. If you just want spread on the first stage for instance and then everything after that to have locality, that doesn't work.

Let's revisit the discussion. Your question was:

> 1. what happens with locality? it looks like this is plugged in after locality, are you disabling locality then or it doesn't have any for your use case? if we create a plugin to choose location I wouldn't necessarily want locality to take affect.

Because you said that if we create a plugin, you wouldn't want locality to take effect, my answer was that you can disable locality if you don't want it to take effect (see the second sketch at the bottom of this comment). I hope that clarifies it. Let me know if I misunderstood your question.

> so I know there is a doc associated with this, but I think it's not complete enough. I think it should go into more detail about specific problem, why locality isn't sufficient (or where locality isn't sufficient (first stage)), how those things interact, what all use cases this applies to, what use cases it doesn't solve or deficiencies in doing it this way. how does this really flow with SS in the different cases where executor lost and state store, explain it for someone who might not be familiar with it. @mridulm brought up a bunch of issues above in his comment as well.

Okay.
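To make the PV use case above more concrete, here is a minimal, hypothetical sketch of what we can do with the mechanism Spark exposes today, i.e. preferred locations on an RDD. `StateStorePreferredLocationRDD` and `pvHostForPartition` are illustrative names only, not the API proposed in this PR:

```scala
import scala.reflect.ClassTag

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper RDD: `pvHostForPartition` is an assumed lookup from a
// state store partition index to the host where its PV has been remounted.
class StateStorePreferredLocationRDD[T: ClassTag](
    prev: RDD[T],
    pvHostForPartition: Int => Option[String])
  extends RDD[T](prev) {

  override protected def getPartitions: Array[Partition] = prev.partitions

  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    prev.iterator(split, context)

  // Preferred locations are only a hint: once the locality wait expires, the
  // scheduler is free to run the task on a different executor, which is the
  // limitation discussed in this thread.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    pvHostForPartition(split.index).toSeq
}
```

Even with such a hint, the scheduler may still place the task elsewhere once the locality wait expires, which is exactly why we want an explicit way to control placement for stateful tasks.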
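And this is the user-facing workaround I referred to for the locality question, assuming a plain `SparkSession` setup. `spark.locality.wait` is an existing config, but setting it to 0 disables delay scheduling for the entire application, not only for the stateful stages:

```scala
import org.apache.spark.sql.SparkSession

// Disables delay scheduling globally; there is no way to scope this to just
// the first micro-batch or to the stateful stages of a streaming query.
val spark = SparkSession.builder()
  .appName("stateful-streaming-query")
  .config("spark.locality.wait", "0s")
  .getOrCreate()
```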
Let me enrich the doc with more details. Thanks for the suggestions.
