tgravescs commented on pull request #32804:
URL: https://github.com/apache/spark/pull/32804#issuecomment-866006388
Thanks for working on this, it looks very interesting.
> Add LocalityPreferredSchedulingRequestContainerPlacementStrategy for compute locality for SchedulingRequest
Can you add more description about this? This is a lot of changes and not
what I expected from the description. I was expecting us just to pass the
node attributes along to YARN, but this is much more than that, so please
describe in detail how it works. How exactly does this interact with
constraints vs. data locality? Is there some sort of timed wait or error
handling if the constraint never becomes satisfiable or the cluster doesn't
have a node with that attribute? You had about 4 examples of things you
could do, but are there more? What dependencies does it need? How does an
attribute allow you to get cardinality or affinity?
Are there any constraints on the Hadoop version? (I thought node attributes
were only introduced in 3.2.0.)
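To make my questions concrete, here is roughly what I'd expect the
SchedulingRequest side to look like. This is just a minimal sketch against
the Hadoop 3.2 node-attribute/placement-constraint APIs, not necessarily
what this PR does; the attribute name and value ("gpu-type" / "a100") and
the allocation tag are made up for illustration. It's also why I'm asking
about the Hadoop version, since these APIs don't exist in older releases:

```scala
import java.util.Collections

import org.apache.hadoop.yarn.api.records.{ExecutionType, ExecutionTypeRequest,
  NodeAttributeOpCode, Priority, Resource, ResourceSizing, SchedulingRequest}
import org.apache.hadoop.yarn.api.resource.PlacementConstraints
import org.apache.hadoop.yarn.api.resource.PlacementConstraints.PlacementTargets

object NodeAttributeRequestSketch {
  // Hard constraint: only place containers on nodes whose attribute
  // "gpu-type" equals "a100" (attribute name/value are hypothetical).
  val constraint = PlacementConstraints.build(
    PlacementConstraints.targetNodeAttribute(
      PlacementConstraints.NODE,
      NodeAttributeOpCode.EQ,
      PlacementTargets.nodeAttribute("gpu-type", "a100")))

  // A SchedulingRequest carrying the constraint. Unlike a plain
  // ResourceRequest there is no host/rack list here, which is why I'm
  // asking how this composes with data locality preferences.
  val request = SchedulingRequest.newBuilder()
    .priority(Priority.newInstance(1))
    .allocationRequestId(0L)
    .allocationTags(Collections.singleton("spark-executor"))
    .executionType(ExecutionTypeRequest.newInstance(ExecutionType.GUARANTEED))
    .resourceSizing(ResourceSizing.newInstance(1, Resource.newInstance(4096, 2)))
    .placementConstraintExpression(constraint)
    .build()
}
```

If what the PR builds differs from this, that's exactly the kind of detail
the description should spell out.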
Eventually the documentation .md file would need to be updated too.