maryannxue commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task 
location
URL: https://github.com/apache/spark/pull/26633#issuecomment-558682192
 
 
   Thanks for the feedback, @tgravescs! This is a workaround. A complete 
solution would be to bring the current locality fallback down to the task level 
instead of the node level, as I said in the previous comment. An RDD knows how 
important locality is based on its job and/or data size and can set a wait time 
for itself. Setting the cluster/node-level wait time would definitely affect 
other workloads and is not a solution here.
   
   I could add the usage by LocalShuffledRowRDD here in this PR, but I was 
thinking it is a Spark SQL change and would be better put in a separate PR. I'm 
fine either way; it's a one-line code change plus some code comments.
   Back to its usage. Normally a LocalShuffledRowRDD can just read locally, 
using its map output locations. We also try to match its parallelism with that 
of the original ShuffledRowRDD, so we get the same level of parallelism and 
better locality. The problem arises when we have fewer mappers (from the 
shuffle map stage) than worker nodes, e.g., 5 vs. 10: if we stick to the 
preferred locations, the LocalShuffledRowRDD will suffer from locality wait and 
may be even slower than the original ShuffledRowRDD. If we had a way to know in 
advance that we would end up in this situation, we could back out of the 
LocalShuffledRowRDD optimization and fall back to the regular shuffle. But the 
number of nodes is dynamic, so we can't decide at compile time.
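   To make the 5 vs. 10 case concrete, here is a toy illustration (plain 
Python with made-up host names, not Spark code): with strict preferred 
locations, only the mapper hosts can run the reduce tasks, so the other half 
of the cluster sits idle until the locality wait expires.

```python
# Toy illustration (not Spark code; host names are hypothetical).
# Each local shuffle read task prefers the host holding its map output.
mapper_hosts = [f"host{i}" for i in range(5)]
worker_hosts = [f"host{i}" for i in range(10)]

# 10 reduce tasks, each preferring one of the 5 mapper hosts.
task_prefs = [mapper_hosts[i % len(mapper_hosts)] for i in range(10)]

# Under strict preferred-location scheduling, only these hosts can
# make progress; the remaining workers wait out the locality timeout.
usable = set(task_prefs) & set(worker_hosts)
print(f"{len(usable)} of {len(worker_hosts)} workers usable")
```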
   The WILDCARD location makes this fully adaptive. Given that Spark always 
offers resources from the highest locality level down to the lowest, a node can 
never be "stolen" (for a specific task set) as long as it has unassigned tasks 
with a higher locality match. So when the number of mappers is equal to or 
larger than the number of worker nodes, even with the WILDCARD location 
specified, the read can be almost completely local. Otherwise, if the number of 
mappers is smaller, you get roughly a "number_of_mappers / number_of_nodes" 
locality hit ratio. In the worst case, if the number of mappers is 
significantly smaller, the LocalShuffledRowRDD regresses to a regular 
ShuffledRowRDD, but it cannot be worse.
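   The "number_of_mappers / number_of_nodes" estimate can be sanity-checked 
with a toy round-robin scheduler. This is a sketch under simplifying 
assumptions (one offer per host per round, map outputs on the first 
min(mappers, nodes) hosts), not Spark's actual TaskSetManager logic:

```python
def local_fraction(n_mappers, n_nodes, waves=4):
    """Toy round-robin scheduler with a WILDCARD fallback location.

    An illustration, not Spark's scheduler: map outputs live on hosts
    0..min(n_mappers, n_nodes)-1; every host also matches WILDCARD,
    so no host ever idles waiting for locality.
    """
    mapper_hosts = list(range(min(n_mappers, n_nodes)))
    # Each task prefers the mapper host that holds its data.
    pending = [mapper_hosts[t % len(mapper_hosts)]
               for t in range(n_nodes * waves)]
    local = assigned = 0
    while pending:
        for host in range(n_nodes):
            if not pending:
                break
            # Offer from high to low locality: a host first takes one of
            # its own (node-local) tasks if any remain; otherwise it
            # takes any pending task via WILDCARD.
            own = next((i for i, p in enumerate(pending) if p == host), None)
            if own is not None:
                pending.pop(own)
                local += 1
            else:
                pending.pop(0)
            assigned += 1
    return local / assigned

print(local_fraction(5, 10))   # roughly 5/10 of reads are node-local
print(local_fraction(10, 10))  # every read is node-local
```

   In this toy model, 5 mappers on 10 nodes gives a 0.5 local-hit ratio and 
10 mappers on 10 nodes gives 1.0, matching the estimate above.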
