GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/19934
[SPARK-3685][CORE] Prints explicit warnings when configured local
directories are set to URIs
## What changes were proposed in this pull request?
This PR proposes to print warnings before creating local directories with `java.io.File`.
I don't think we can simply disallow values such as `hdfs:/tmp/foo` and throw
an exception, because that might break compatibility. Note that
`hdfs:/tmp/foo` creates a local directory called `hdfs:/`.
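To illustrate why such a value ends up as a local directory, here is a minimal sketch of how `java.io.File` treats a URI-looking string. The class `FileUriDemo` and the helper `topComponent` are hypothetical names used only for this illustration; they are not part of Spark.

```java
import java.io.File;

class FileUriDemo {
    // Hypothetical helper: walks up the parent chain and returns the
    // top-most path component that java.io.File sees for the given string.
    static String topComponent(String path) {
        File f = new File(path);
        while (f.getParentFile() != null) {
            f = f.getParentFile();
        }
        return f.getPath();
    }

    public static void main(String[] args) {
        File f = new File("hdfs:/tmp/foo");
        // java.io.File does not interpret the scheme; the string is just a
        // relative path whose first component happens to be "hdfs:".
        System.out.println(f.isAbsolute());
        System.out.println(topComponent("hdfs:/tmp/foo"));
    }
}
```

Because the path is relative, calling `mkdirs()` on it would create the `hdfs:` component under the current working directory, which matches the behaviour described above.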
There was a lot of discussion here about whether we should support this in
other file systems or not; however, since the JIRA targets "Spark's local dir
should accept only local paths", I simply print warnings here.
I think we could open a separate JIRA with a design doc if this is something
we should support.
**Before**
```
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```
This creates a local directory as below:
```
file:/
└── a
    └── b
        └── c
...
```
**After**
```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```
Now, it prints a warning as below:
```
...
17/12/09 21:58:49 WARN Utils: The configured local directories are not
expected to be URIs; however, got suspicious values [file:/a/b/c]. Please check
your configured local directories.
...
```
```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c
```
The warning also covers comma-separated values:
```
...
17/12/09 22:05:01 WARN Utils: The configured local directories are not
expected to be URIs; however, got suspicious values [file:/a/b/c, hdfs:/a/b/c].
Please check your configured local directories.
...
```
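The kind of check behind such a warning can be sketched as follows. This is an illustrative assumption, not the code in this PR: the class `LocalDirCheck`, the method `suspiciousDirs`, and the rule of skipping single-letter schemes (so that Windows drive letters such as `C:/a/b/c` are not flagged) are all hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

class LocalDirCheck {
    // Hypothetical helper: returns the comma-separated entries that parse
    // as URIs with a scheme, e.g. "file:/a/b/c" or "hdfs:/a/b/c".
    static List<String> suspiciousDirs(String confValue) {
        List<String> result = new ArrayList<>();
        for (String dir : confValue.split(",")) {
            String trimmed = dir.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                String scheme = new URI(trimmed).getScheme();
                // Assumption: a one-letter scheme is a Windows drive letter.
                if (scheme != null && scheme.length() > 1) {
                    result.add(trimmed);
                }
            } catch (URISyntaxException e) {
                // Not a valid URI (for example, it contains backslashes),
                // so treat it as a plain local path.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [file:/a/b/c, hdfs:/a/b/c]
        System.out.println(suspiciousDirs("file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c"));
    }
}
```

A caller could log one warning listing the returned entries and then continue with directory creation unchanged, which keeps existing configurations working.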
## How was this patch tested?
Manually tested:
```
./bin/spark-shell --conf spark.local.dir=C:\\a\\b\\c
./bin/spark-shell --conf spark.local.dir=/tmp/a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c,/tmp/a/b/c,C:\\a\\b\\c
```
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-3685
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19934.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19934
----
commit 0db4bf1c1b447ce39f790d7c81fc3bb2619e156a
Author: hyukjinkwon <[email protected]>
Date: 2017-12-09T13:10:00Z
Prints explicit warnings when configured local directories are set to URIs
----