Github user growse commented on a diff in the pull request:
https://github.com/apache/spark/pull/4620#discussion_r24902719
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -686,19 +686,17 @@ private[spark] object Utils extends Logging {
       // user has access to them.
       getYarnLocalDirs(conf).split(",")
     } else {
-      // In non-Yarn mode (or for the driver in yarn-client mode), we cannot trust the user
-      // configuration to point to a secure directory. So create a subdirectory with restricted
-      // permissions under each listed directory.
       Option(conf.getenv("SPARK_LOCAL_DIRS"))
         .getOrElse(conf.get("spark.local.dir", System.getProperty("java.io.tmpdir")))
         .split(",")
         .flatMap { root =>
           try {
             val rootDir = new File(root)
-            if (rootDir.exists || rootDir.mkdirs()) {
-              val dir = createDirectory(root)
-              chmod700(dir)
-              Some(dir.getAbsolutePath)
+            if (rootDir.exists) {
+              Some(rootDir.getAbsolutePath)
+            } else if (rootDir.mkdirs()) {
+              chmod700(rootDir)
--- End diff --
The default value of `spark.local.dir` is `/tmp` (the fallback here is
`System.getProperty("java.io.tmpdir")`), which should exist on all Linux
systems at least. If I've read the patch right, that means
`getOrCreateLocalRootDirs` will simply return `/tmp` itself, which is usually
readable by everyone, and the Spark worker will then write its scratch files
directly into `/tmp`. The whole point of the previous behaviour seems to have
been to create an isolated, `chmod 700` subdirectory inside `spark.local.dir`,
precisely because that value was assumed to point at a shared directory.
Is this behaviour intentional, or have I missed something?
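
To make the concern concrete, here is a minimal, self-contained sketch of the
two code paths. The `createDirectory` and `chmod700` bodies are my own
illustrative stand-ins for the private helpers of the same names in
`Utils.scala`, not the actual implementations, and since the quoted diff is
truncated, the patched branch is assumed to return the path after `chmod700`:

```scala
import java.io.File
import java.util.UUID

object LocalDirSketch {

  // Stand-in for Utils.createDirectory: make a uniquely named subdirectory
  // under `root`. (Body is illustrative, not the real implementation.)
  private def createDirectory(root: String): File = {
    val dir = new File(root, "spark-" + UUID.randomUUID.toString)
    if (!dir.mkdirs()) throw new java.io.IOException(s"Failed to create $dir")
    dir
  }

  // Stand-in for Utils.chmod700: restrict the directory to its owner only.
  private def chmod700(file: File): Boolean = {
    file.setReadable(false, false) && file.setReadable(true, true) &&
    file.setWritable(false, false) && file.setWritable(true, true) &&
    file.setExecutable(false, false) && file.setExecutable(true, true)
  }

  // Old behaviour: always carve out a fresh 0700 subdirectory under root,
  // even when root (e.g. /tmp) already exists.
  def oldResolve(root: String): Option[String] = {
    val rootDir = new File(root)
    if (rootDir.exists || rootDir.mkdirs()) {
      val dir = createDirectory(root)
      chmod700(dir)
      Some(dir.getAbsolutePath)
    } else {
      None
    }
  }

  // Patched behaviour: an existing root is returned as-is, so the default
  // /tmp comes back shared and world-readable, with no isolated subdirectory.
  def newResolve(root: String): Option[String] = {
    val rootDir = new File(root)
    if (rootDir.exists) {
      Some(rootDir.getAbsolutePath)
    } else if (rootDir.mkdirs()) {
      chmod700(rootDir)
      Some(rootDir.getAbsolutePath) // assumed: the quoted diff cuts off here
    } else {
      None
    }
  }

  def main(args: Array[String]): Unit = {
    println(oldResolve("/tmp")) // e.g. Some(/tmp/spark-<uuid>), mode 0700
    println(newResolve("/tmp")) // Some(/tmp) -- the shared directory itself
  }
}
```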