GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/19934
[SPARK-3685][CORE] Prints explicit warnings when configured local directories are set to URIs

## What changes were proposed in this pull request?

This PR proposes to print warnings before creating local directories by `java.io.File`.

I think we can't simply disallow cases such as `hdfs:/tmp/foo` and throw an exception, because that might break compatibility. Note that `hdfs:/tmp/foo` creates a directory called `hdfs:/`.

There were many discussions here about whether we should support this in other file systems or not; however, since the JIRA targets "Spark's local dir should accept only local paths", I simply tried to print warnings here. I think we could open another JIRA and design doc separately if this is something we should support.

**Before**

```
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

This creates a local directory as below:

```
file:/
└── a
    └── b
        └── c
...
```

**After**

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

Now, it prints a warning as below:

```
...
17/12/09 21:58:49 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c]. Please check your configured local directories.
...
```

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c
```

It also works with comma-separated ones:

```
...
17/12/09 22:05:01 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c, hdfs:/a/b/c]. Please check your configured local directories.
...
```

## How was this patch tested?

Manually tested:

```
./bin/spark-shell --conf spark.local.dir=C:\\a\\b\\c
./bin/spark-shell --conf spark.local.dir=/tmp/a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c,/tmp/a/b/c,C:\\a\\b\\c
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-3685

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19934

----
commit 0db4bf1c1b447ce39f790d7c81fc3bb2619e156a
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2017-12-09T13:10:00Z

    Prints explicit warnings when configured local directories are set to URIs

----
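
For illustration only, the kind of check described under "What changes were proposed" could be sketched as below. This is not the code from this PR: the class `LocalDirCheck`, the method `findSuspicious`, and the regular expression are hypothetical, and the heuristic (a scheme of two or more characters followed by `:/`) is an assumption chosen so that Windows drive letters such as `C:\\a\\b\\c` are not flagged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the implementation in this PR.
class LocalDirCheck {
    // Assumed heuristic: a scheme of at least two characters followed by ":/",
    // e.g. "file:/" or "hdfs:/". Requiring two characters keeps single-letter
    // Windows drive prefixes such as "C:\a" or "C:/a" from matching.
    private static final Pattern URI_LIKE =
        Pattern.compile("^[A-Za-z][A-Za-z0-9+.-]+:/.*");

    // Returns the entries of a comma-separated directory list that look like URIs.
    static List<String> findSuspicious(String commaSeparatedDirs) {
        List<String> suspicious = new ArrayList<>();
        for (String dir : commaSeparatedDirs.split(",")) {
            String trimmed = dir.trim();
            if (URI_LIKE.matcher(trimmed).matches()) {
                suspicious.add(trimmed);
            }
        }
        return suspicious;
    }

    public static void main(String[] args) {
        String conf = "file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c";
        List<String> suspicious = findSuspicious(conf);
        // Warn instead of throwing, so existing configurations keep working.
        if (!suspicious.isEmpty()) {
            System.err.println("WARN: The configured local directories are not expected "
                + "to be URIs; however, got suspicious values " + suspicious
                + ". Please check your configured local directories.");
        }
    }
}
```

The sketch only warns and never rejects a value, which matches the compatibility concern raised above. A path such as `foo:/bar` is a legal relative path on POSIX systems, so a pattern like this can only mark values as suspicious, not as invalid.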