Github user mridulm commented on the issue:
https://github.com/apache/spark/pull/15218
@zhzhan I am curious why this is the case for the jobs being mentioned.
This PR should have an impact only if the locality preference of the taskset
being run is fairly suboptimal to begin with, no?
If the tasks have a PROCESS_LOCAL or NODE_LOCAL locality preference, that
preference will take precedence, and attempts to spread the load across nodes
or to reduce that spread, as envisioned here, will not work.
So the target here seems to be tasks with a RACK_LOCAL or ANY locality
preference, which should be fairly uncommon, unless I am missing something
here w.r.t. the jobs being run.
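
To make the precedence argument concrete, here is a toy Python model (not Spark code). The locality levels follow the ordering of Spark's TaskLocality enum; the dictionaries, `locality_for`, `pick_offer`, and the `load` map are hypothetical names invented for this sketch. It also ignores delay scheduling (`spark.locality.wait`), under which Spark falls back to looser levels after a timeout.

```python
from enum import IntEnum


class Locality(IntEnum):
    # Same ordering as Spark's TaskLocality: lower value = tighter locality.
    PROCESS_LOCAL = 0
    NODE_LOCAL = 1
    NO_PREF = 2
    RACK_LOCAL = 3
    ANY = 4


def locality_for(task, offer):
    """Tightest locality level a (hypothetical) task can get on an offer."""
    if offer["executor"] in task["executors"]:
        return Locality.PROCESS_LOCAL
    if offer["host"] in task["hosts"]:
        return Locality.NODE_LOCAL
    if not (task["executors"] or task["hosts"] or task["racks"]):
        return Locality.NO_PREF
    if offer["rack"] in task["racks"]:
        return Locality.RACK_LOCAL
    return Locality.ANY


def pick_offer(task, offers, load):
    """Toy policy: locality decides first; load only breaks ties."""
    return min(offers, key=lambda o: (locality_for(task, o), load[o["host"]]))


offers = [
    {"executor": "e1", "host": "h1", "rack": "r1"},
    {"executor": "e2", "host": "h2", "rack": "r1"},
    {"executor": "e3", "host": "h3", "rack": "r2"},
]
load = {"h1": 8, "h2": 1, "h3": 0}  # running tasks per host (made-up numbers)

node_local_task = {"executors": set(), "hosts": {"h1"}, "racks": {"r1"}}
rack_local_task = {"executors": set(), "hosts": set(), "racks": {"r1"}}
any_task = {"executors": set(), "hosts": {"h9"}, "racks": {"r9"}}

for name, task in [("node-local", node_local_task),
                   ("rack-local", rack_local_task),
                   ("any", any_task)]:
    print(name, "->", pick_offer(task, offers, load)["host"])
```

In this model the node-local task lands on the busiest host, h1, because locality outranks load; a load-aware choice only changes the outcome for the rack-local and any-locality tasks, which is the point made above.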