Kimahriman edited a comment on pull request #35085:
URL: https://github.com/apache/spark/pull/35085#issuecomment-1074222601


   Since this has already gotten a few approvals, I want to propose a few minor 
changes for the RDD fetching that I could do either here or in a separate PR.
   Current state of things:
   - The shuffle service cannot clean up cached RDD files in a secure YARN 
environment when RDD fetching from the shuffle service is enabled (this issue 
predates this PR)
   - If both RDD fetching and shuffle removal are enabled, RDD fetching won't 
work because the RDD files' permissions are not changed to world readable to 
accommodate the folder permission change
   
   Possible fix:
   - Update the folder permission if either feature is enabled
   - Update `DiskStore` to change newly created files to world readable if RDD 
fetching is enabled
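
A rough sketch of what the second fix could look like, using only `java.nio.file`. The class name `BlockFilePermissions`, the method `makeWorldReadableIfNeeded`, and the `rddFetchEnabled` flag are hypothetical names for illustration; this is not the actual `DiskStore` code, and it assumes a POSIX file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Spark.
final class BlockFilePermissions {
    private BlockFilePermissions() {}

    /**
     * Adds read permission for owner, group, and others to a newly created
     * block file, but only when RDD fetching from the shuffle service is
     * enabled. Existing permission bits (e.g. owner write) are kept.
     */
    static void makeWorldReadableIfNeeded(Path file, boolean rddFetchEnabled)
            throws IOException {
        if (!rddFetchEnabled) {
            return; // leave whatever the process umask produced
        }
        // Copy into a mutable set; the returned set may not be modifiable.
        Set<PosixFilePermission> perms =
            new HashSet<>(Files.getPosixFilePermissions(file));
        perms.add(PosixFilePermission.OWNER_READ);
        perms.add(PosixFilePermission.GROUP_READ);
        perms.add(PosixFilePermission.OTHERS_READ);
        Files.setPosixFilePermissions(file, perms);
    }
}
```

Setting the bits explicitly after creation, rather than relying on the umask, is what would let a test run under a restrictive umask still observe world-readable files.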
   
   Question: Do all RDD block creations go through that DiskStore code path?
   
   I've also updated my tests locally to check the permission changes on files. 
If you run them manually with a restrictive umask (`(umask 0027 && ./build/sbt 
"testOnly...")`), you can verify that the permissions are changed correctly 
(and the tests fail as expected if you comment out the permission-changing code).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


