Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19130

Hi @cloud-fan, the main purpose of `spark.yarn.dist.forceDownloadSchemes` is to explicitly use Spark's own logic to handle remote resources instead of relying on Hadoop. For example, if `spark.yarn.dist.forceDownloadSchemes` is set to `http,https`, these two kinds of resources will be downloaded by Spark before being added to the dist cache, even if they are supported by the http FS in Hadoop 2.9+. With Hadoop versions below 2.9, Hadoop doesn't support the http FS, so we always fall back to Spark's own logic to download such resources, and it is not necessary to configure this parameter.
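
To make the rule above concrete, here is a minimal Python sketch of the decision as I described it. It is only an illustration, not the code in this PR: the function names and the `hadoop_supported_schemes` sets are hypothetical stand-ins for whatever filesystems the Hadoop version on the classpath actually provides.

```python
from urllib.parse import urlparse


def parse_schemes(conf_value):
    """Split a comma-separated config value such as "http,https" into a set."""
    return {s.strip() for s in conf_value.split(",") if s.strip()}


def downloaded_by_spark(uri, force_download_schemes, hadoop_supported_schemes):
    """Return True if Spark itself should download the resource.

    Hypothetical model of the behaviour described above:
    - a scheme listed in forceDownloadSchemes is always handled by Spark;
    - otherwise Spark handles it only when Hadoop has no filesystem for it.
    """
    scheme = urlparse(uri).scheme
    if scheme in force_download_schemes:
        return True
    return scheme not in hadoop_supported_schemes


# Assumed scheme sets, for illustration only.
hadoop_2_9_plus = {"hdfs", "http", "https"}
hadoop_below_2_9 = {"hdfs"}

forced = parse_schemes("http,https")
jar = "https://example.com/app.jar"

print(downloaded_by_spark(jar, forced, hadoop_2_9_plus))   # forced by config
print(downloaded_by_spark(jar, set(), hadoop_2_9_plus))    # left to Hadoop
print(downloaded_by_spark(jar, set(), hadoop_below_2_9))   # no http FS, so Spark
```

The third call shows why the setting is unnecessary below Hadoop 2.9: without an http FS, the resource falls to Spark's downloader whether or not the scheme is listed.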