Kimahriman commented on a change in pull request #35085:
URL: https://github.com/apache/spark/pull/35085#discussion_r796805778
##########
File path: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala
##########
@@ -94,7 +95,13 @@ private[spark] class DiskBlockManager(
} else {
val newDir = new File(localDirs(dirId), "%02x".format(subDirId))
if (!newDir.exists()) {
- Files.createDirectory(newDir.toPath)
+ // SPARK-37618: Create dir as group writable so files within can be deleted by the
+ // shuffle service
+ val path = newDir.toPath
+ Files.createDirectory(path)
+ val currentPerms = Files.getPosixFilePermissions(path)
+ currentPerms.add(PosixFilePermission.GROUP_WRITE)
Review comment:
The problem I'm talking about is losing the setgid bit when creating a
folder with 770 permissions. This problem is the same whether you use mkdir,
`Files.setPosixFilePermissions`, or `FileSystem.setPermission`. A lot of this
is probably somewhat specific to running a secure YARN setup:
- Node managers run as `yarn:hadoop` and create all container directories as
`<user>:hadoop` with the setgid bit and umask `0027`
- Applications run as the user who launched them, as a local Linux user
- These users are _not_ in the `hadoop` group (the superuser group)
Running Spark standalone probably doesn't need all of the permission
changes, assuming everything runs as the same user. I don't know where Tez
fits into this (does it run processes as a local Linux user?).
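
To illustrate the setgid concern, here is a minimal sketch (not code from this PR). It assumes a directory created as 750 (mode 777 masked by umask `0027`) and adds `GROUP_WRITE` as the diff does. `SetgidSketch`, `modeBits`, and `SetgidBit` are hypothetical names used only for illustration. The sketch touches no files. It only shows that a `Set[PosixFilePermission]` can express the nine rwx bits and nothing else, so a mode built from such a set has no setgid bit to carry. What happens to the bit on disk still depends on the platform and filesystem.

```scala
import java.nio.file.attribute.{PosixFilePermission, PosixFilePermissions}

object SetgidSketch {
  // Octal 02000 (decimal 1024) is the setgid bit in a POSIX mode word.
  val SetgidBit: Int = 1024

  // Convert a permission set to the mode bits it is able to express.
  // PosixFilePermissions.toString yields nine characters such as "rwxrwx---";
  // every character other than '-' contributes one set bit.
  def modeBits(perms: java.util.Set[PosixFilePermission]): Int =
    PosixFilePermissions.toString(perms).foldLeft(0) { (acc, c) =>
      (acc << 1) | (if (c == '-') 0 else 1)
    }

  def main(args: Array[String]): Unit = {
    // Assumed starting point: 750, i.e. 777 masked by umask 0027.
    val perms = new java.util.HashSet[PosixFilePermission](
      PosixFilePermissions.fromString("rwxr-x---"))
    // Same step as in the diff: make the directory group writable.
    perms.add(PosixFilePermission.GROUP_WRITE)

    val mode = modeBits(perms)
    println(PosixFilePermissions.toString(perms))
    println(Integer.toOctalString(mode))
    // The set has no element for setgid, so this bit is never present.
    println((mode & SetgidBit) != 0)
  }
}
```

Preserving setgid would therefore need a mechanism that can express the full mode word, which the nine-element `PosixFilePermission` enum cannot.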
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]