Hi, all!
We have run into similar security requirements and investigated the available
security strategies. The third strategy mentioned in the YARN security doc (AM
keytab distributed via YARN; AM regenerates delegation tokens for containers)
is already used by Spark 1.5+, and we agree that Flink needs to support it.
Moreover, we would like the security improvements in Flink to be applicable to
other resource management systems such as k8s (BTW, we have done some work to
let Flink applications run natively on a k8s cluster). We are going to do some
work on this and hope it helps in finding a more generic solution. Thanks!
Tao Yang


------------------------------------------------------------------
From: Rong Rong <walter...@gmail.com>
Sent: Wednesday, December 19, 2018, 03:06
To: dev <dev@flink.apache.org>
Subject: Re: [DISCUSS] Flink Kerberos Improvement

Hi Shuyi,

Yes, I think impersonation is a very valid question. It can actually be
considered as two questions, as I stated in the doc.
1. In the doc I stated that impersonation should be implemented in the
user-side code, and that the cluster client should only be invoked as the
actual user 'joe'.
2. However, since the cluster client currently assumes no impersonation at
all, much of the code assumes that a fully authorized client can be
instantiated with the same authority that the actual Flink cluster has.
When impersonation is enabled, this might not be the case. For example, if
impersonation is in place, the cluster client running on joe's behalf most
likely will not, and should not, have access to the keytab file of 'joe'.
Instead, a delegation token is used. The second part of the doc tries to
address this issue.

--
Rong

On Mon, Dec 17, 2018 at 11:41 PM Shuyi Chen <suez1...@gmail.com> wrote:

> Hi Rong, thanks a lot for the proposal. Currently, Flink assumes the keytab
> is located in a remote DFS. Pre-installing keytabs statically in the YARN node
> local filesystem is a common approach, so I think we should support this
> mode in Flink natively. As an optimization to reduce the KDC access
> frequency, we should also support method 3 (the DT approach) as discussed
> in [1]. One question: why do we need to implement impersonation in
> Flink? I assume the superuser can do the impersonation for 'joe' and 'joe'
> can then invoke the Flink client to deploy the job. Thanks a lot.
>
> Shuyi
>
> [1] https://docs.google.com/document/d/10V7LiNlUJKeKZ58mkR7oVv1t6BrC6TZi3FGf2Dm6-i8/edit
>
> On Mon, Dec 17, 2018 at 5:49 PM Rong Rong <walter...@gmail.com> wrote:
>
> > Hi All,
> >
> > We have been experimenting with integrating Kerberos with Flink in our
> > corporate environment and found some limitations in the current
> > Flink-Kerberos security mechanism when running on Apache YARN.
> >
> > Based on the Hadoop Kerberos security guide [1], only a subset of the
> > suggested long-running service security mechanisms is supported in Flink.
> > Furthermore, the current model does not work well with a superuser
> > impersonating actual users [2] for deployment purposes, which is a widely
> > adopted way to launch applications in corporate environments.
> >
> > We would like to propose an improvement [3] that introduces the other
> > common methods [1] for securing long-running applications on YARN and
> > enables impersonation mode. Any comments and suggestions are highly
> > appreciated.
> >
> > Many thanks,
> > Rong
> >
> > [1] https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services
> > [2] https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> > [3] https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit?usp=sharing
> >
>
>
> --
> "So you have to trust that the dots will somehow connect in your future."
>
