maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location URL: https://github.com/apache/spark/pull/26633#issuecomment-558682192

Thanks for the feedback, @tgravescs! This is a workaround. A complete solution would be to bring the current locality fallback to the task level instead of the node level, as I said in the previous comment. An RDD knows how important locality is based on its job and/or data size and sets a wait time for itself. Setting the cluster/node-level wait time would definitely affect other workloads and is not a solution here.

I could add the usage by LocalShuffledRowRDD in this PR, but I was thinking it is a Spark SQL change and would be better put in a separate PR. I'm fine either way; it's a one-line code change plus some code comments.

Back to its usage. Normally a LocalShuffledRowRDD can just read locally, using its map output locations. And we try to match its parallelism with the original ShuffledRowRDD, so we get the same level of parallelism and better locality. But the problem arises when we have fewer mappers (from the shuffle map stage) than worker nodes, e.g., 5 vs. 10: if we stick to the preferred locations, the LocalShuffledRowRDD will suffer from locality wait and can be even slower than the original ShuffledRowRDD. If we had a way to know in advance that we'd end up in this situation, we could back out of the LocalShuffledRowRDD optimization and fall back to the regular shuffle. But the number of nodes is dynamic, so we can't decide at compile time.

The WILDCARD location makes this fully adaptive. Given that Spark always offers resources from the highest locality level down to the lowest, a node can never be "stolen" (from a specific task set) as long as that task set still has unassigned tasks with a higher locality match. So when the number of mappers is equal to or larger than the number of worker nodes, even with the WILDCARD location specified, the read can be almost completely local.
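To illustrate the "never stolen" argument, here is a toy two-pass scheduler sketch (the function name and structure are my own simplification, not Spark's real TaskSetManager logic): offers are matched at the NODE_LOCAL level first and only then at the ANY (wildcard) level, so a wildcard task never takes a slot from a node that still has a node-local task pending.

```python
# Toy sketch of high-to-low locality matching (assumed names, not Spark internals).
def schedule(task_prefs, nodes):
    """task_prefs: {task_id: preferred_host}; every task also carries the
    wildcard location, so unmatched tasks may run anywhere."""
    assigned = {}
    free_nodes = list(nodes)
    pending = dict(task_prefs)
    # Pass 1: NODE_LOCAL -- give each node a task that prefers it.
    for node in nodes:
        for tid, host in list(pending.items()):
            if host == node:
                assigned[tid] = node
                del pending[tid]
                free_nodes.remove(node)
                break
    # Pass 2: ANY (wildcard) -- leftover tasks take whatever is still free.
    for tid, node in zip(list(pending), free_nodes):
        assigned[tid] = node
    return assigned

# 3 mappers on 3 nodes: every read stays local.
print(schedule({0: "n1", 1: "n2", 2: "n3"}, ["n1", "n2", "n3"]))
# prints {0: 'n1', 1: 'n2', 2: 'n3'}

# 2 mappers on 3 nodes: task 2 falls back to the idle node n3 via the wildcard.
print(schedule({0: "n1", 1: "n2", 2: "n1"}, ["n1", "n2", "n3"]))
# prints {0: 'n1', 1: 'n2', 2: 'n3'}
```

The second example shows the adaptive behavior: nodes n1 and n2 are reserved for their node-local tasks, and only the otherwise-idle n3 is handed to the wildcard task.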
Otherwise, if the number of mappers is smaller, you get roughly a "number_of_mappers/number_of_nodes" locality hit rate. In the worst case, when the number of mappers is significantly smaller, the LocalShuffledRowRDD regresses to a regular ShuffledRowRDD read, but it cannot be worse.
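As a back-of-the-envelope check of that ratio (the helper below is hypothetical, not part of Spark): with wildcard locations, roughly min(mappers, nodes) tasks in a wave can land on a node holding their map output, so the local fraction is capped at mappers/nodes.

```python
# Hypothetical helper: expected fraction of tasks that read locally in a
# wave, assuming mappers are spread one-per-node and wildcard fallback.
def locality_hit_fraction(num_mappers: int, num_nodes: int) -> float:
    return min(num_mappers, num_nodes) / num_nodes

print(locality_hit_fraction(10, 10))  # mappers >= nodes: 1.0, fully local
print(locality_hit_fraction(5, 10))   # the 5 vs. 10 example above: 0.5
print(locality_hit_fraction(1, 10))   # far fewer mappers: 0.1, mostly remote
```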
