[ https://issues.apache.org/jira/browse/FLINK-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395556#comment-15395556 ]
ASF GitHub Bot commented on FLINK-3929: --------------------------------------- Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/2275#discussion_r72428903 --- Diff: docs/internals/flink_security.md --- @@ -0,0 +1,87 @@ +--- +title: "Flink Security" +# Top navigation +top-nav-group: internals +top-nav-pos: 10 +top-nav-title: Flink Security +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +This document briefly describes how Flink security works in the context of various deployment mechanism (Standalone/Cluster vs YARN) +and the connectors that participates in Flink Job execution stage. This documentation can be helpful for both administrators and developers +who plans to run Flink on a secure environment. + +## Objective + +The primary goal of Flink security model is to enable secure data access for jobs within a cluster via connectors. In production deployment scenario, +streaming jobs are understood to run for longer period of time (days/weeks/months) and the system must be able to authenticate against secure +data sources throughout the life of the job. The current implementation supports running Flink cluster (Job Manager/Task Manager/Jobs) under the +context of a Kerberos identity based on Keytab credential supplied during deployment time. Any jobs submitted will continue to run in the identity of the cluster. + +## How Flink Security works +Flink deployment includes running Job Manager/ZooKeeper, Task Manager(s), Web UI and Job(s). Jobs (user code) can be submitted through web UI and/or CLI. +A Job program may use one or more connectors (Kafka, HDFS, Cassandra, Flume, Kinesis etc.,) and each connector may have a specific security +requirements (Kerberos, database based, SSL/TLS, custom etc.,). While satisfying the security requirements for all the connectors evolve over a period +of time but at this time of writing, the following connectors/services are tested for Kerberos/Keytab based security. + +- Kafka (0.9) +- HDFS +- ZooKeeper + +Hadoop uses UserGroupInformation (UGI) class to manage security. UGI is a static implementation that takes care of handling Kerberos authentication. Flink bootstrap implementation +(JM/TM/CLI) takes care of instantiating UGI with appropriate security credentials to establish necessary security context. + +Services like Kafka and ZooKeeper uses SASL/JAAS based authentication mechanism to authenticate against a Kerberos server. It expects JAAS configuration with platform-specific login --- End diff -- with *a* platform-specific login > Support for Kerberos Authentication with Keytab Credential > ---------------------------------------------------------- > > Key: FLINK-3929 > URL: https://issues.apache.org/jira/browse/FLINK-3929 > Project: Flink > Issue Type: New Feature > Reporter: Eron Wright > Assignee: Vijay Srinivasaraghavan > Labels: kerberos, security > Original Estimate: 672h > Remaining Estimate: 672h > > _This issue is part of a series of improvements detailed in the [Secure Data > Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing] > design doc._ > Add support for a keytab credential to be associated with the Flink cluster, > to facilitate: > - Kerberos-authenticated data access for connectors > - Kerberos-authenticated ZooKeeper access > Support both the standalone and YARN deployment modes. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)