Hello everybody,

I'm running some tests on how Flink, as a long-running YARN session, handles security with Kerberos. In particular, I start Flink on YARN with a service account and then deploy a job via the CLI as another user. In the job I try to access a private HDFS folder belonging to the latter, but the job fails due to permission issues: the user actually running the job is the service account that started Flink on YARN in the first place, not the user who submitted it.

I'm running Flink 1.0.0-RC5, launching the long-running session with:

    bin/yarn-session.sh -n 2 -tm 4096 -s 3

and then running the following command:

    bin/flink run examples/batch/WordCount.jar \
      --input hdfs:///user/stefano.baghino/hamlet.txt \
      --output hdfs:///user/stefano.baghino/hamlet.out

Here are the logs: https://gist.github.com/stefanobaghino/6605ec33a1c4b632fb78

It looks like the YARN session is acting as a proxy for the user instead of receiving a delegation. Is there a way to change this behavior? Is this by design? Is there any interest in implementing delegation (if it's not already implemented)? Otherwise, is there a workaround, apart from running one-off jobs on YARN?

Thank you so much in advance.

--
BR,
Stefano Baghino
Software Engineer @ Radicalbit