viirya commented on pull request #32136:
URL: https://github.com/apache/spark/pull/32136#issuecomment-821305514
> I didn't see an API for this? informScheduledTask is called at scheduling, so my assumption is the plugin may be keeping state about where things have gone, and if things reset I would have expected an API call to let the plugin know. In TaskSchedulerImpl.resourceOffers it may have assigned some tasks, but if it doesn't get all the slots needed for a barrier stage it then resets them. Maybe your intention is that the plugin doesn't keep state? Or is the intention that the API would just throw?
This is a good point. I agree we may need an API to inform the plugin that the task assignment was reset. My use case is not related to barrier tasks, but you're right that if the plugin works with barrier tasks, it should be informed in case it keeps state.
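To make the reset idea concrete, here is a minimal sketch of what such a callback could look like. This is not the API in this PR; apart from `informScheduledTask`, which is quoted above, every name here is hypothetical.

```scala
// Hypothetical sketch only: informScheduledTask is the method discussed
// above; informTaskAssignmentsReset is an assumed addition, not part of
// this PR as it stands.
trait TaskSchedulerPlugin {
  // Called when the scheduler tentatively assigns a task to an executor.
  def informScheduledTask(taskId: Long, executorId: String): Unit

  // Assumed callback: invoked if TaskSchedulerImpl.resourceOffers rolls
  // back tentative assignments (e.g., a barrier stage did not get all the
  // slots it needs), so a stateful plugin can drop what it recorded.
  def informTaskAssignmentsReset(taskIds: Seq[Long]): Unit
}
```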
> So I don't completely understand this. Are you just saying that the locality is not specific enough? I get the first micro-batch case, especially perhaps with dynamic allocation; is that the case here? You seem to hint at it above in a comment, but I don't understand the other cases. Have you tried the newer locality algorithm vs the old one?
>
> Does this come down to you really just wanting the scheduler to force an even distribution, after which locality should work? It seems like you are saying it needs more than that, though, and locality isn't enough; I would like to understand why.
> > non-trivial locality config (e.g., 10h)
>
> I'm not sure what that means? Do you just mean it has more logic in figuring out the locality?
For the other cases, I think locality should generally work for that purpose; it is how we keep state store locations stable right now. In the above comment it was more of a guess: I'm wondering whether the global locality setting we need to keep state store locations unchanged across micro-batches would also be suitable for stages in which no state store is involved. To keep state store locations unchanged, we need a long enough locality wait to ask Spark to delay scheduling stateful tasks. But as we know, such a long wait may hurt cluster resource utilization, so it could be bad for other stages that do not have a state store.
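For reference, the "non-trivial locality config (e.g., 10h)" quoted above refers to stretching Spark's locality wait. A minimal sketch, assuming the standard `spark.locality.wait` setting is what carries the delay (the app name is made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

// A very long locality wait makes Spark keep waiting for a task's
// preferred location (the executor holding the state store) rather than
// falling back to another executor. The catch, as noted above, is that
// the wait applies to all stages, including ones without a state store,
// which can hurt cluster utilization.
val spark = SparkSession.builder()
  .appName("stateful-streaming-app") // hypothetical app name
  .config("spark.locality.wait", "10h")
  .getOrCreate()
```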
> Overall I'm fine with having some sort of a plugin to allow people to experiment, but I also want it generic enough to cover the cases I mentioned, and for it to not cause problems where people can shoot themselves in the foot and then complain about why things aren't working. It would be nice to really understand this case, to see whether the plugin is needed or whether something else can be improved for everyone's benefit.
Sure. I appreciate your comments and questions.