cloud-fan commented on issue #26633: [SPARK-29994][CORE] Add WILDCARD task location
URL: https://github.com/apache/spark/pull/26633#issuecomment-559141731

It is the same problem as https://issues.apache.org/jira/browse/SPARK-18886 . It would be great if we could solve that problem first, but it seems there is no conclusion yet.

There is one difference in `LocalShuffledRowRDD`: it has a baseline. It's converted from the normal shuffle reader, and it shouldn't be slower than that reader. The normal shuffle reader mostly doesn't have locality (a reducer needs to read blocks from many hosts), so the WILDCARD location is a good solution. It effectively turns off the locality wait for `LocalShuffledRowRDD`, so that it is not slower than the normal shuffle reader.

Do we have an ETA for when we can resolve https://issues.apache.org/jira/browse/SPARK-18886 ? We can't remove the locality wait, as nowadays we usually run many jobs on a Spark cluster. It's unclear to me what the best solution is.

BTW, this feature won't be documented, and in my view it's not really public. Users can only learn about it by reading the discussion here. We can still remove it later if https://issues.apache.org/jira/browse/SPARK-18886 is resolved. To me this is just a workaround to turn off delay scheduling for certain tasks instead of globally, which does have value.
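
To make the "turn off delay scheduling for certain tasks instead of globally" idea concrete, here is a minimal toy sketch in Python. It is not Spark's scheduler code: `Task`, `can_launch`, and the `WILDCARD` marker value are hypothetical names chosen for illustration, and the model assumes a single locality level with one wait timeout.

```python
from dataclasses import dataclass

# Hypothetical marker meaning "any host is acceptable for this task".
WILDCARD = "*"


@dataclass
class Task:
    task_id: int
    preferred_host: str  # a host name, or WILDCARD


def can_launch(task: Task, offered_host: str,
               waited_s: float, locality_wait_s: float) -> bool:
    """Decide whether a task may start on the offered host right now.

    Toy model of delay scheduling: a task with a preferred host is held
    back until either that host is offered or the locality wait expires.
    A WILDCARD task skips the wait entirely.
    """
    if task.preferred_host == WILDCARD:
        return True  # no locality preference, so never wait
    if task.preferred_host == offered_host:
        return True  # offered host matches the preference
    # Non-local offer: accept only once the wait has run out.
    return waited_s >= locality_wait_s
```

In this sketch the wait still applies to every task that names a real host, so other jobs on the cluster keep their locality behaviour; only tasks marked with the wildcard bypass it.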