Repository: flink
Updated Branches:
  refs/heads/release-1.2 8d3ad4515 -> 699f4b05b


[FLINK-5364] [security] Fix documentation setup for Kerberos


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/699f4b05
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/699f4b05
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/699f4b05

Branch: refs/heads/release-1.2
Commit: 699f4b05b36de49b6892f0cc1222a5a59179b407
Parents: 00193f7
Author: Stephan Ewen <[email protected]>
Authored: Wed Jan 11 14:32:36 2017 +0100
Committer: Stephan Ewen <[email protected]>
Committed: Wed Jan 11 19:06:10 2017 +0100

----------------------------------------------------------------------
 docs/internals/flink_security.md | 146 ----------------------------------
 docs/ops/security-kerberos.md    | 145 +++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 146 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/699f4b05/docs/internals/flink_security.md
----------------------------------------------------------------------
diff --git a/docs/internals/flink_security.md b/docs/internals/flink_security.md
deleted file mode 100644
index a83f3b9..0000000
--- a/docs/internals/flink_security.md
+++ /dev/null
@@ -1,146 +0,0 @@
----
-title:  "Flink Security"
-# Top navigation
-top-nav-group: internals
-top-nav-pos: 10
-top-nav-title: Flink Security
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-This document briefly describes how Flink security works in the context of various deployment mechanisms (Standalone, YARN, or Mesos),
-filesystems, connectors, and state backends.
-
-## Objective
-The primary goals of the Flink Kerberos security infrastructure are:
-1. to enable secure data access for jobs within a cluster via connectors (e.g. Kafka)
-2. to authenticate to ZooKeeper (if configured to use SASL)
-3. to authenticate to Hadoop components (e.g. HDFS, HBase)
-
-In a production deployment scenario, streaming jobs are understood to run for long periods of time (days/weeks/months) and be able to authenticate to secure
-data sources throughout the life of the job.  Kerberos keytabs do not expire in that timeframe, unlike a Hadoop delegation token
-or ticket cache entry.
-
-The current implementation supports running Flink clusters (Job Manager/Task Manager/jobs) with either a configured keytab credential
-or with Hadoop delegation tokens.  Keep in mind that all jobs share the credential configured for a given cluster.  To use a different keytab
-for a certain job, simply launch a separate Flink cluster with a different configuration.  Numerous Flink clusters may run side-by-side in a YARN
-or Mesos environment.
-
-## How Flink Security works
-Conceptually, a Flink program may use first- or third-party connectors (Kafka, HDFS, Cassandra, Flume, Kinesis, etc.) requiring arbitrary authentication
-methods (Kerberos, SSL/TLS, username/password, etc.).  While satisfying the security requirements for all connectors is an ongoing effort,
-Flink provides first-class support for Kerberos authentication only.  The following services and connectors are tested for Kerberos authentication:
-
-- Kafka (0.9+)
-- HDFS
-- HBase
-- ZooKeeper
-
-Note that it is possible to enable the use of Kerberos independently for each service or connector.  For example, the user may enable
-Hadoop security without necessitating the use of Kerberos for ZooKeeper, or vice versa.  The shared element is the configuration of
-Kerberos credentials, which is then explicitly used by each component.
-
-The internal architecture is based on security modules (implementing `org.apache.flink.runtime.security.modules.SecurityModule`) which
-are installed at startup.  The next section describes each security module.
-
-### Hadoop Security Module
-This module uses the Hadoop `UserGroupInformation` (UGI) class to establish a process-wide *login user* context.  The login user is
-then used for all interactions with Hadoop, including HDFS, HBase, and YARN.
-
-If Hadoop security is enabled (in `core-site.xml`), the login user will have whatever Kerberos credential is configured.  Otherwise,
-the login user conveys only the user identity of the OS account that launched the cluster.
-
-### JAAS Security Module
-This module provides a dynamic JAAS configuration to the cluster, making available the configured Kerberos credential to ZooKeeper,
-Kafka, and other such components that rely on JAAS.
-
-Note that the user may also provide a static JAAS configuration file using the mechanisms described in the [Java SE Documentation](http://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html).  Static entries override any
-dynamic entries provided by this module.
-
-### ZooKeeper Security Module
-This module configures certain process-wide ZooKeeper security-related settings, namely the ZooKeeper service name (default: `zookeeper`)
-and the JAAS login context name (default: `Client`).
-
-## Security Configuration
-
-### Flink Configuration
-The user's Kerberos ticket cache (managed with `kinit`) is used automatically, based on the following configuration option:
-
-- `security.kerberos.login.use-ticket-cache`: Indicates whether to read from the user's Kerberos ticket cache (default: `true`).
-
-A Kerberos keytab can be supplied by adding the following configuration options to the Flink configuration file:
-
-- `security.kerberos.login.keytab`: Absolute path to a Kerberos keytab file that contains the user credentials.
-
-- `security.kerberos.login.principal`: Kerberos principal name associated with the keytab.
-
-These configuration options establish a cluster-wide credential to be used in a Hadoop and/or JAAS context.  Whether the credential is used in a Hadoop context is based on the Hadoop configuration (see next section).  To be used in a JAAS context, the configuration specifies which JAAS *login contexts* (or *applications*) are enabled with the following configuration option:
-
-- `security.kerberos.login.contexts`: A comma-separated list of login contexts to provide the Kerberos credentials to (for example, `Client` to use the credentials for ZooKeeper authentication).
-
-ZooKeeper-related configuration overrides:
-
-- `zookeeper.sasl.service-name`: The Kerberos service name that the ZooKeeper cluster is configured to use (default: `zookeeper`). Facilitates mutual authentication between the client (Flink) and server.
-
-- `zookeeper.sasl.login-context-name`: The JAAS login context name that the ZooKeeper client uses to request the login context (default: `Client`). Should match
-one of the values specified in `security.kerberos.login.contexts`.
-
-### Hadoop Configuration
-
-The Hadoop configuration is located via the `HADOOP_CONF_DIR` environment variable and by other means (see `org.apache.flink.api.java.hadoop.mapred.utils.HadoopUtils`).  The Kerberos credential (configured above) is used automatically if Hadoop security is enabled.
-
-Note that Kerberos credentials found in the ticket cache aren't transferable to other hosts.  In this scenario, the Flink CLI acquires Hadoop
-delegation tokens (for HDFS and for HBase).
-
-## Deployment Modes
-Here is some information specific to each deployment mode.
-
-### Standalone Mode
-
-Steps to run a secure Flink cluster in standalone/cluster mode:
-1. Add security-related configuration options to the Flink configuration file (on all cluster nodes).
-2. Ensure that the keytab file exists at the path indicated by `security.kerberos.login.keytab` on all cluster nodes.
-3. Deploy the Flink cluster as normal.
-
-### YARN/Mesos Mode
-
-Steps to run a secure Flink cluster in YARN/Mesos mode:
-1. Add security-related configuration options to the Flink configuration file on the client.
-2. Ensure that the keytab file exists at the path indicated by `security.kerberos.login.keytab` on the client node.
-3. Deploy the Flink cluster as normal.
-
-In YARN/Mesos mode, the keytab is automatically copied from the client to the Flink containers.
-
-For more information, see the <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md">YARN security</a> documentation.
-
-#### Using `kinit` (YARN only)
-
-In YARN mode, it is possible to deploy a secure Flink cluster without a keytab, using only the ticket cache (as managed by `kinit`).
-This avoids the complexity of generating a keytab and avoids entrusting the cluster manager with it.  The main drawback is
-that the cluster is necessarily short-lived since the generated delegation tokens will expire (typically within a week).
-
-Steps to run a secure Flink cluster using `kinit`:
-1. Add security-related configuration options to the Flink configuration file on the client.
-2. Log in using the `kinit` command.
-3. Deploy the Flink cluster as normal.
-
-## Further Details
-### Ticket Renewal
-Each component that uses Kerberos is independently responsible for renewing the Kerberos ticket-granting ticket (TGT).
-Hadoop, ZooKeeper, and Kafka all renew the TGT automatically when provided a keytab.  In the delegation token scenario,
-YARN itself renews the token (up to its maximum lifespan).

http://git-wip-us.apache.org/repos/asf/flink/blob/699f4b05/docs/ops/security-kerberos.md
----------------------------------------------------------------------
diff --git a/docs/ops/security-kerberos.md b/docs/ops/security-kerberos.md
new file mode 100644
index 0000000..2afe760
--- /dev/null
+++ b/docs/ops/security-kerberos.md
@@ -0,0 +1,145 @@
+---
+title:  "Kerberos Authentication Setup and Configuration"
+nav-parent_id: setup
+nav-pos: 10
+nav-title: Kerberos
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+This document briefly describes how Flink security works in the context of various deployment mechanisms (Standalone, YARN, or Mesos),
+filesystems, connectors, and state backends.
+
+## Objective
+The primary goals of the Flink Kerberos security infrastructure are:
+1. to enable secure data access for jobs within a cluster via connectors (e.g. Kafka)
+2. to authenticate to ZooKeeper (if configured to use SASL)
+3. to authenticate to Hadoop components (e.g. HDFS, HBase)
+
+In a production deployment scenario, streaming jobs are understood to run for long periods of time (days/weeks/months) and be able to authenticate to secure
+data sources throughout the life of the job.  Kerberos keytabs do not expire in that timeframe, unlike a Hadoop delegation token
+or ticket cache entry.
+
+The current implementation supports running Flink clusters (Job Manager/Task Manager/jobs) with either a configured keytab credential
+or with Hadoop delegation tokens.  Keep in mind that all jobs share the credential configured for a given cluster.  To use a different keytab
+for a certain job, simply launch a separate Flink cluster with a different configuration.  Numerous Flink clusters may run side-by-side in a YARN
+or Mesos environment.
+
+## How Flink Security works
+Conceptually, a Flink program may use first- or third-party connectors (Kafka, HDFS, Cassandra, Flume, Kinesis, etc.) requiring arbitrary authentication
+methods (Kerberos, SSL/TLS, username/password, etc.).  While satisfying the security requirements for all connectors is an ongoing effort,
+Flink provides first-class support for Kerberos authentication only.  The following services and connectors are tested for Kerberos authentication:
+
+- Kafka (0.9+)
+- HDFS
+- HBase
+- ZooKeeper
+
+Note that it is possible to enable the use of Kerberos independently for each service or connector.  For example, the user may enable
+Hadoop security without necessitating the use of Kerberos for ZooKeeper, or vice versa.  The shared element is the configuration of
+Kerberos credentials, which is then explicitly used by each component.
+
+The internal architecture is based on security modules (implementing `org.apache.flink.runtime.security.modules.SecurityModule`) which
+are installed at startup.  The next section describes each security module.
+
+### Hadoop Security Module
+This module uses the Hadoop `UserGroupInformation` (UGI) class to establish a process-wide *login user* context.  The login user is
+then used for all interactions with Hadoop, including HDFS, HBase, and YARN.
+
+If Hadoop security is enabled (in `core-site.xml`), the login user will have whatever Kerberos credential is configured.  Otherwise,
+the login user conveys only the user identity of the OS account that launched the cluster.
+
+### JAAS Security Module
+This module provides a dynamic JAAS configuration to the cluster, making available the configured Kerberos credential to ZooKeeper,
+Kafka, and other such components that rely on JAAS.
+
+Note that the user may also provide a static JAAS configuration file using the mechanisms described in the [Java SE Documentation](http://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html).  Static entries override any
+dynamic entries provided by this module.
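+
+For illustration, a static JAAS file entry for the ZooKeeper client login context might look like the following sketch (the keytab path and principal are placeholders, not values shipped with Flink):
+
+```
+Client {
+    com.sun.security.auth.module.Krb5LoginModule required
+    useKeyTab=true
+    storeKey=true
+    keyTab="/path/to/flink.keytab"
+    principal="flink-user@EXAMPLE.COM";
+};
+```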
+
+### ZooKeeper Security Module
+This module configures certain process-wide ZooKeeper security-related settings, namely the ZooKeeper service name (default: `zookeeper`)
+and the JAAS login context name (default: `Client`).
+
+## Security Configuration
+
+### Flink Configuration
+The user's Kerberos ticket cache (managed with `kinit`) is used automatically, based on the following configuration option:
+
+- `security.kerberos.login.use-ticket-cache`: Indicates whether to read from the user's Kerberos ticket cache (default: `true`).
+
+A Kerberos keytab can be supplied by adding the following configuration options to the Flink configuration file:
+
+- `security.kerberos.login.keytab`: Absolute path to a Kerberos keytab file that contains the user credentials.
+
+- `security.kerberos.login.principal`: Kerberos principal name associated with the keytab.
+
+These configuration options establish a cluster-wide credential to be used in a Hadoop and/or JAAS context.  Whether the credential is used in a Hadoop context is based on the Hadoop configuration (see next section).  To be used in a JAAS context, the configuration specifies which JAAS *login contexts* (or *applications*) are enabled with the following configuration option:
+
+- `security.kerberos.login.contexts`: A comma-separated list of login contexts to provide the Kerberos credentials to (for example, `Client` to use the credentials for ZooKeeper authentication).
+
+ZooKeeper-related configuration overrides:
+
+- `zookeeper.sasl.service-name`: The Kerberos service name that the ZooKeeper cluster is configured to use (default: `zookeeper`). Facilitates mutual authentication between the client (Flink) and server.
+
+- `zookeeper.sasl.login-context-name`: The JAAS login context name that the ZooKeeper client uses to request the login context (default: `Client`). Should match
+one of the values specified in `security.kerberos.login.contexts`.
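+
+Putting these options together, a keytab-based setup in the Flink configuration file might look like the following sketch (the keytab path, principal, and realm are placeholders):
+
+```yaml
+security.kerberos.login.use-ticket-cache: false
+security.kerberos.login.keytab: /path/to/flink.keytab
+security.kerberos.login.principal: flink-user@EXAMPLE.COM
+# Provide the credential to the JAAS login contexts used by ZooKeeper and Kafka.
+security.kerberos.login.contexts: Client,KafkaClient
+```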
+
+### Hadoop Configuration
+
+The Hadoop configuration is located via the `HADOOP_CONF_DIR` environment variable and by other means (see `org.apache.flink.api.java.hadoop.mapred.utils.HadoopUtils`).  The Kerberos credential (configured above) is used automatically if Hadoop security is enabled.
+
+Note that Kerberos credentials found in the ticket cache aren't transferable to other hosts.  In this scenario, the Flink CLI acquires Hadoop
+delegation tokens (for HDFS and for HBase).
+
+## Deployment Modes
+Here is some information specific to each deployment mode.
+
+### Standalone Mode
+
+Steps to run a secure Flink cluster in standalone/cluster mode:
+1. Add security-related configuration options to the Flink configuration file (on all cluster nodes).
+2. Ensure that the keytab file exists at the path indicated by `security.kerberos.login.keytab` on all cluster nodes.
+3. Deploy the Flink cluster as normal.
+
+### YARN/Mesos Mode
+
+Steps to run a secure Flink cluster in YARN/Mesos mode:
+1. Add security-related configuration options to the Flink configuration file on the client.
+2. Ensure that the keytab file exists at the path indicated by `security.kerberos.login.keytab` on the client node.
+3. Deploy the Flink cluster as normal.
+
+In YARN/Mesos mode, the keytab is automatically copied from the client to the Flink containers.
+
+For more information, see the <a href="https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md">YARN security</a> documentation.
+
+#### Using `kinit` (YARN only)
+
+In YARN mode, it is possible to deploy a secure Flink cluster without a keytab, using only the ticket cache (as managed by `kinit`).
+This avoids the complexity of generating a keytab and avoids entrusting the cluster manager with it.  The main drawback is
+that the cluster is necessarily short-lived since the generated delegation tokens will expire (typically within a week).
+
+Steps to run a secure Flink cluster using `kinit`:
+1. Add security-related configuration options to the Flink configuration file on the client.
+2. Log in using the `kinit` command.
+3. Deploy the Flink cluster as normal.
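+
+As a sketch, the `kinit`-based flow on the client might look like this (the principal and job jar are placeholders):
+
+```sh
+# Obtain a Kerberos TGT in the local ticket cache.
+kinit alice@EXAMPLE.COM
+# Deploy the job to YARN; delegation tokens are derived from the ticket cache.
+./bin/flink run -m yarn-cluster ./examples/WordCount.jar
+```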
+
+## Further Details
+### Ticket Renewal
+Each component that uses Kerberos is independently responsible for renewing the Kerberos ticket-granting ticket (TGT).
+Hadoop, ZooKeeper, and Kafka all renew the TGT automatically when provided a keytab.  In the delegation token scenario,
+YARN itself renews the token (up to its maximum lifespan).
