Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
I've read everyone's comments and listed the preconditions and solutions for
this directory permission setting.
## Preconditions
1. Every Flink job (session or single) can specify a directory for storing
checkpoints, configured via `state.backend.fs.checkpointdir`.
2. Different jobs can set the same or different directories, which means their
checkpoint files can be stored in one shared directory or in separate ones, with
a **sub-dir** created per job and named after its job-id.
3. Jobs can be run by different users, and users require that one user
cannot read chp files written by another user, since that would leak
information.
4. In some cases (which are relatively rare, I think), as @StephanEwen
said, users need to access other users' chp files for cloning/migrating
jobs.
5. A chp file path looks like:
`hdfs://namenode:port/flink-checkpoints/<job-id>/chk-17/6ba7b810-9dad-11d1-80b4-00c04fd430c8`
## Solutions
### Solution #1 (would not require changes)
1. Admins control the permissions of the root directory via HDFS ACLs (set it like:
user1 can read & write, user2 can only read, ...).
2. This has two disadvantages: a) it is a huge burden for admins to set
different permissions for a large number of users/groups; and b) sub-dirs
inherit permissions from the root directory, which means they are basically the same,
which makes it hard to do fine-grained control.
### Solution #2 (this proposal)
1. We don't care what the permission of the root dir is. It can be created
during setup or while a job is running, as long as it is available to use.
2. We control every sub-dir created by different jobs (which are submitted
by different users, in most cases), and set it to a more restrictive value (like "700")
to prevent it from being read by others.
3. If someone wants to migrate or clone jobs across users (again, this scenario
is rare in my view), they should ask admins (normally the HDFS admin) to add ACLs (or
whatever is needed) for this purpose.
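
To illustrate Solution #2, here is a minimal sketch of creating a per-job sub-dir with owner-only ("700") permissions. It uses plain `java.nio.file` on a local POSIX filesystem as a stand-in, not the code from this PR; the class name `CheckpointDirPermissions`, the method `createJobCheckpointDir`, and the job-id string are hypothetical. On HDFS the analogous step would presumably go through Hadoop's `FileSystem.mkdirs(Path, FsPermission)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper: local-filesystem analogue of the per-job sub-dir idea.
public class CheckpointDirPermissions {

    // Creates <root>/<jobId> readable, writable and listable by the owner only.
    static Path createJobCheckpointDir(Path root, String jobId) throws IOException {
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rwx------");

        // The root dir's own permissions are left untouched (Solution #2, item 1).
        Files.createDirectories(root);

        Path jobDir = root.resolve(jobId);
        Files.createDirectory(jobDir, PosixFilePermissions.asFileAttribute(ownerOnly));

        // Set the permissions again explicitly so the result does not depend
        // on the process umask.
        Files.setPosixFilePermissions(jobDir, ownerOnly);
        return jobDir;
    }

    public static void main(String[] args) throws IOException {
        // Temporary directory standing in for the checkpoint root.
        Path root = Files.createTempDirectory("flink-checkpoints");
        Path jobDir = createJobCheckpointDir(root, "example-job-id");
        System.out.println(jobDir + " -> "
                + PosixFilePermissions.toString(Files.getPosixFilePermissions(jobDir)));
    }
}
```

With this layout, another user who can traverse the root dir still cannot list or read anything under a job's sub-dir unless an admin grants access separately, which matches item 3 above.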