Github user EronWright commented on the issue:
https://github.com/apache/flink/pull/3335
When working on FLINK-3932, I came to the conclusion that the state backend
data should probably be written into the Hadoop user's home directory, since
most Hadoop setups protect the home
Github user EronWright commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic can you elaborate on the multi-user scenario that you have
in mind? Keep in mind that a given Flink cluster doesn't provide any isolation
between jobs in that cluster. So it
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
@greghogan I think it's more like an improvement rather than a new feature.
Anyway, I'll post to the mailing list for discussion.
Thanks all guys :)
---
If your project is set up for
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic there is no reason for Flink to support a feature which
already has proper implementations in a lower layer. Every configuration option
adds to users' cognitive load and increases
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
As subdirectories are created by different jobs/users under the root directory, we
keep their permissions minimal (or configurable) at creation in order to keep the data safe.
When a user has needs of
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic How are you proposing to solve the handling of arbitrarily
complex permissions?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
@greghogan I'm aware of that, but my concern is that when lots of users store
their checkpoint files under the same root directory, it would be a burden for
the admin to set different ACLs for different
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic, ACLs combine with the standard file permissions
(`user-group-other`). Only one ACL is necessary to implement this PR. A second
ACL would allow, for example, a `flink_admin` group
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
We are very close in scenario :)
My point is that multiple users would use the same root directory to store
their checkpoint files (creating a single directory for each job is complex),
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic Am I right that your scenario assumes that
multiple different users submit Flink jobs and these jobs cannot be "prepared"
by a script that sets up a dedicated
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
I've read everyone's comments and listed the preconditions and solutions for this
directory permission setting.
## Preconditions
1. Every Flink job (session or single) can specify a directory
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3335
I agree with @StephanEwen that people probably manage the directory
permissions directly when configuring the Flink jobs. It would be quite
annoying if the Flink job changed the permissions you set
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3335
The HDFS administrator can configure the parent directory for checkpoints
with user and/or group ACL permissions. A default ACL is then inherited by the
newly created files and subdirectories
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
Hi @greghogan, I'm not sure I understand the relationship between HDFS
ACLs and the change I proposed. Could you explain more specifically? Thanks.
Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3335
@WangTaoTheTonic HDFS supports POSIX ACLs
([link](http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists)).
These are
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
Hi Stephan,
You may have slightly misunderstood this change. It only controls
the directories named by job ID (generated using UUID), not the configured root
checkpoint
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3335
Thank you for the contribution. I see the idea behind the fix.
I am unsure whether we should let Flink manage the permissions of these
directories. My gut feeling is that this is