knaufk commented on a change in pull request #18746: URL: https://github.com/apache/flink/pull/18746#discussion_r808778983
########## File path: docs/content/docs/deployment/security/overview.md ########## @@ -0,0 +1,68 @@ +--- +title: "Overview" +weight: 1 +type: docs +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Security Overview + +Frameworks that process data are sensitive components; you must use authentication and encryption to +secure your data and data sources. Apache Flink supports authentication with [Kerberos](https://web.mit.edu/kerberos/) +and can be configured to encrypt all network communication with [SSL](https://www.ssl.com/faqs/faq-what-is-ssl/). + +When we talk about security for Flink, we generally make a distinction between securing the internal +communication within the Flink cluster (i.e. between the Task Managers, between the Task Managers and +the Flink Master) and securing the external communication between the cluster and the outside world. Review comment: We don't use the term "Flink Master" anymore. It is just called "Jobmanager". ########## File path: docs/content/docs/deployment/security/overview.md ########## @@ -0,0 +1,68 @@ +--- +title: "Overview" +weight: 1 +type: docs +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. 
See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Security Overview Review comment: I think, Security should be a top-level section sibling of "Deployment". ########## File path: docs/content/docs/deployment/security/kerberos.md ########## @@ -0,0 +1,116 @@ +--- +title: Authentication with Kerberos +weight: 2 +type: docs +aliases: + - /deployment/security/kerberos.html + - /ops/security-kerberos.html +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Enabling and Configuring Authentication with Kerberos + +## What is Kerberos? 
+ +[Kerberos](https://web.mit.edu/kerberos/) is a network authentication protocol that provides a secure, +single-sign-on, trusted, third-party mutual authentication service. It is designed to provide strong +authentication for client/server applications by using secret-key cryptography. + +## How the Flink Security Infrastructure works with Kerberos + +A Flink program may use first- or third-party connectors, necessitating arbitrary authentication methods +(Kerberos, SSL/TLS, username/password, etc.). While satisfying the security requirements for all connectors +is an ongoing effort, Flink provides first-class support for Kerberos authentication only. Review comment: Flink only has first-class support for Kerberos? Why do you think so? ########## File path: docs/content/docs/deployment/security/running-cluster.md ########## @@ -0,0 +1,292 @@ +--- +title: Incorporating Security Features in a Running Cluster +weight: 4 +type: docs +aliases: +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> + +# Incorporating Security Features in a Running Cluster + +This guide describes how Flink security works in the context of various [deployment modes]({{< ref "docs/deployment/overview" >}}), Review comment: This page seems to focus on different resource providers not deployment modes, right? ########## File path: docs/content/docs/deployment/security/overview.md ########## @@ -0,0 +1,68 @@ +--- +title: "Overview" +weight: 1 +type: docs +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Security Overview + +Frameworks that process data are sensitive components; you must use authentication and encryption to +secure your data and data sources. Apache Flink supports authentication with [Kerberos](https://web.mit.edu/kerberos/) +and can be configured to encrypt all network communication with [SSL](https://www.ssl.com/faqs/faq-what-is-ssl/). + +When we talk about security for Flink, we generally make a distinction between securing the internal +communication within the Flink cluster (i.e. between the Task Managers, between the Task Managers and +the Flink Master) and securing the external communication between the cluster and the outside world. 
+ +Internally, netty is used for the TCP connections used for data exchange among the task managers, +and Akka is used for RPC between the Flink master and the task managers. + +Externally, HTTP is used for pretty much everything, except that some external services used as sources +or sinks may use some other network protocol. + +## What is supported? + +Security enhancement features by the Flink community make it easy to access secured data, protect +associated credentials, and increase overall security in a Flink cluster. The following security +measures are currently supported: + +- Authentication of connections between Flink processes +- Encryption of data transferred between Flink processes using SSL (Note that there is a performance + degradation when SSL is enabled, the magnitude of which depends on the CPU type and the JVM implementation.) +- Authorization of read / write operations by clients +- Authorization is pluggable and integration with external authorization services is supported + +It is worth noting that security is optional because the overall philosophy in Flink is to have defaults Review comment: Saying that "security is optional" is our "philosophy" does not sound good. ;) ########## File path: docs/content/docs/deployment/security/overview.md ########## @@ -0,0 +1,68 @@ +--- +title: "Overview" +weight: 1 +type: docs +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Security Overview + +Frameworks that process data are sensitive components; you must use authentication and encryption to +secure your data and data sources. Apache Flink supports authentication with [Kerberos](https://web.mit.edu/kerberos/) +and can be configured to encrypt all network communication with [SSL](https://www.ssl.com/faqs/faq-what-is-ssl/). + +When we talk about security for Flink, we generally make a distinction between securing the internal +communication within the Flink cluster (i.e. between the Task Managers, between the Task Managers and Review comment: Taskmanager is usually one word. ########## File path: docs/content/docs/deployment/security/ssl.md ########## @@ -0,0 +1,243 @@ +--- +title: "Encryption and Authentication using SSL" +weight: 3 +type: docs +aliases: + - /deployment/security/ssl.html + - /ops/security-ssl.html +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. 
See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Encryption and Authentication using SSL + +Flink supports mutual authentication (when two parties authenticate each other at the same time) and +encryption of network communication with SSL for internal and external communication. + +**By default, SSL/TLS authentication and encryption is not enabled** (to have defaults work out-of-the-box). + +This guide will explain internal vs external connectivity, and provide instructions on how to enable +SSL/TLS authentication and encryption for network communication with and between Flink processes. We +will go through steps such as generating certificates, setting up TrustStores and KeyStores, and +configuring cipher suites. + +For how-tos and tips for different deployment environments (i.e. standalone clusters, Kubernetes, YARN), +check out the section on [Incorporating Security Features in a Running Cluster](#). + +## Internal and External Communication + +There are two types of network connections to authenticate and encrypt: internal and external. + +{{< img src="/fig/ssl_internal_external.svg" alt="Internal and External Connectivity" width=75% >}} + +For more flexibility, security for internal and external connectivity can be enabled and configured +separately. + +### Internal Connectivity + +Flink internal communication refers to all connections made between Flink processes. These include: + +- Control messages: RPC between JobManager / TaskManager / Dispatcher / ResourceManager +- Transfers on the data plane: connections between TaskManagers to exchange data during shuffles, + broadcasts, redistribution, etc +- Blob service communication: distribution of libraries and other artifacts + +All internal connections are SSL authenticated and encrypted. The connections use **mutual authentication**, +meaning both server and client side of each connection need to present the certificate to each other. 
+The certificate acts as a shared secret and can be embedded into container images or attached to your +deployment setup. These connections run Flink custom protocols. Users never connect directly to internal +connectivity endpoints. + +### External Connectivity + +Flink external communication refers to all connections made from the outside to Flink processes. +This includes: +- communication with the Dispatcher to submit Flink jobs (session clusters) +- communication of the Flink CLI with the JobManager to inspect and modify a running Flink job/application + +Most of these connections are exposed via REST/HTTP endpoints (and used by the web UI). Some external Review comment: Are there any connections that don't use these endpoints? ########## File path: docs/content/docs/deployment/security/running-cluster.md ########## @@ -0,0 +1,292 @@ +--- +title: Incorporating Security Features in a Running Cluster +weight: 4 +type: docs +aliases: +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> + +# Incorporating Security Features in a Running Cluster + +This guide describes how Flink security works in the context of various [deployment modes]({{< ref "docs/deployment/overview" >}}), Review comment: I am wondering if the resource provider specifics should actually go with the resource providers, so that we only cross-reference in both directions. :thinking: ########## File path: docs/content/docs/deployment/security/ssl.md ########## @@ -0,0 +1,243 @@ +--- +title: "Encryption and Authentication using SSL" +weight: 3 +type: docs +aliases: + - /deployment/security/ssl.html + - /ops/security-ssl.html +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Encryption and Authentication using SSL + +Flink supports mutual authentication (when two parties authenticate each other at the same time) and +encryption of network communication with SSL for internal and external communication. + +**By default, SSL/TLS authentication and encryption is not enabled** (to have defaults work out-of-the-box). + +This guide will explain internal vs external connectivity, and provide instructions on how to enable +SSL/TLS authentication and encryption for network communication with and between Flink processes. 
We +will go through steps such as generating certificates, setting up TrustStores and KeyStores, and +configuring cipher suites. + +For how-tos and tips for different deployment environments (i.e. standalone clusters, Kubernetes, YARN), +check out the section on [Incorporating Security Features in a Running Cluster](#). + +## Internal and External Communication + +There are two types of network connections to authenticate and encrypt: internal and external. + +{{< img src="/fig/ssl_internal_external.svg" alt="Internal and External Connectivity" width=75% >}} + +For more flexibility, security for internal and external connectivity can be enabled and configured +separately. + +### Internal Connectivity + +Flink internal communication refers to all connections made between Flink processes. These include: + +- Control messages: RPC between JobManager / TaskManager / Dispatcher / ResourceManager +- Transfers on the data plane: connections between TaskManagers to exchange data during shuffles, + broadcasts, redistribution, etc +- Blob service communication: distribution of libraries and other artifacts + +All internal connections are SSL authenticated and encrypted. The connections use **mutual authentication**, +meaning both server and client side of each connection need to present the certificate to each other. +The certificate acts as a shared secret and can be embedded into container images or attached to your +deployment setup. These connections run Flink custom protocols. Users never connect directly to internal +connectivity endpoints. + +### External Connectivity + +Flink external communication refers to all connections made from the outside to Flink processes. +This includes: +- communication with the Dispatcher to submit Flink jobs (session clusters) +- communication of the Flink CLI with the JobManager to inspect and modify a running Flink job/application + +Most of these connections are exposed via REST/HTTP endpoints (and used by the web UI). 
Some external +services used as sources or sinks may use some other network protocol. + +The server will, by default, accept connections from any client, meaning that the REST endpoint does +not authenticate the client. These REST endpoints, however, can be configured to require SSL encryption +and mutual authentication. + +However, the recommended approach is setting up and configuring a dedicated proxy service (a "sidecar +proxy") that controls access to the REST endpoint. This involves binding the REST endpoint to the +loopback interface (or the pod-local interface in Kubernetes) and starting a REST proxy that authenticates +and forwards the requests to Flink. Examples for proxies that Flink users have deployed are [Envoy Proxy](https://www.envoyproxy.io/) +or [NGINX with MOD_AUTH](http://nginx.org/en/docs/http/ngx_http_auth_request_module.html). + +The rationale behind delegating authentication to a proxy is that such proxies offer a wide variety +of authentication options and thus better integration into existing infrastructures. + +## Queryable State + +Connections to the [queryable state]({{< ref "docs/dev/datastream/fault-tolerance/queryable_state" >}}) +endpoints is currently not authenticated or encrypted. + +## SSL Setups + +{{< img src="/fig/ssl_mutual_auth.svg" alt="SSL Mutual Authentication" width=75% >}} + +Each participant has a keystore and a truststore, which are files. + +A keystore contains a certificate (which contains a public key) and a private key. A truststore +contains trusted certificates and certificate chains/authorities. + +Establishing encrypted, authenticated communication is a multi-step process, shown in the figure. +Certificates are exchanged and validated against the truststore, after which the two parties can +safely communicate. 
+ +### Typical SSL Setup in Flink + +For mutually authenticated internal connections, note that: + +- a keystore and a truststore can contain the same dedicated certificate +- the same file can be used for both keystore and truststore +- wildcard hostnames or addresses can be used + +For internal communication between servers in a Flink cluster, a secure setup can be established with +a single, self-signed certificate that all parties use as both their keystore and truststore. You can +also use this approach for external communication when establishing mutual authentication for communication +between clients and the Flink Master. + +### Configuring Keystores and Truststores + +The SSL configuration requires configuring a keystore and a truststore such that the truststore trusts +the keystore's certificate. + +You can use the [keytool utility](https://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html) +to generate keys, certificates, keystores, and truststores: + +```bash + keytool -genkeypair -alias flink.internal -keystore internal.keystore \ + -dname "CN=flink.internal" -storepass internal_store_password -keyalg RSA \ + -keysize 4096 -storetype PKCS12 +``` + +| Deployment mode | How to add the files | +|------------------------|-------------------------------------------------------------------------| +| Standalone clusters | copy the files to each node, or add them to a shared mounted filesystem | +| Containerized clusters | add the files to the container images | +| YARN | the cluster deployment phase can distribute these files | + +### Using Cipher Suites + +While the acts of encryption and decryption themselves are performed by keys, cipher suites outline +the set of steps that the keys must follow to do so and the order in which these steps are executed. +There are numerous cipher suites out there, each one with varying instructions on the encryption and +decryption process. 
+ +{{< hint warning >}} +The [IETF RFC 7525](https://tools.ietf.org/html/rfc7525) recommends using a specific set of cipher +suites for strong security. Since these cipher suites are not available on many setups out-of-the-box, +Flink defaults to TLS_RSA_WITH_AES_128_CBC_SHA (a slightly weaker but more widely available cipher suite). + +If stronger encryption is available in your environment, we recommend that you update your SSL setup +to the stronger cipher suites by adding the below entry to the Flink configuration file (`flink-conf.yaml`): + +```yaml +security.ssl.algorithms: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 +``` + +If these cipher suites are not supported in your setup, you will see that Flink processes will not +be able to connect to each other. +{{< /hint >}} + +### Configuring SSL for Internal Connectivity + +The following setting in `flink-conf.yaml` is used to enable SSL for all internal connections: + +```yaml +security.ssl.internal.enabled: true +``` + +{{< hint info >}} +For backwards compatibility, the **security.ssl.enabled** option still exists and enables SSL +for both internal and external/REST endpoints. +{{< /hint >}} + +You can disable security for different connection types. When `security.ssl.internal.enabled` is set +to `true`, you can set the following parameters to `false` to disable SSL for that particular connection +type: + +- `taskmanager.data.ssl.enabled` → Data communication between TaskManagers +- `blob.service.ssl.enabled` → Transport of BLOBs from JobManager to TaskManager +- `akka.ssl.enabled` → Akka-based RPC connections between JobManager / TaskManager / ResourceManager + +Because internal communication is mutually authenticated between the server and the client, keystore +and truststore typically refer to a dedicated certificate that acts as a shared secret. 
In such a setup, +the certificate can use wildcard hostnames or addresses. When using self-signed certificates, it is +even possible to use the same file as keystore and truststore. + +Take note of the following configuration settings: + +```yaml +security.ssl.internal.keystore: /path/to/file.keystore +security.ssl.internal.keystore-password: keystore_password +security.ssl.internal.key-password: key_password +security.ssl.internal.truststore: /path/to/file.truststore +security.ssl.internal.truststore-password: truststore_password +``` + +When using a certificate that is not self-signed, but signed by Certified Authorities (CA), you need +to use certificate pinning to allow only a specific certificate to be trusted when establishing the +connectivity: + +```yaml +security.ssl.internal.cert.fingerprint: 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 +``` + +### Configuring SSL for External Connectivity (REST Endpoints) + +The following setting in `flink-conf.yaml` is used to enable SSL for REST/external connections: + +```yaml +security.ssl.rest.enabled: true +``` + +{{< hint info >}} +For backwards compatibility, the **security.ssl.enabled** option still exists and enables SSL +for both internal and external/REST endpoints. +{{< /hint >}} + +By default, the keystore is used by the server REST endpoints, and the truststore is used +by the REST clients (including the CLI client) to accept the server's certificate. In the case where +the REST keystore has a self-signed certificate, the truststore must trust that certificate directly. +If the REST endpoint uses a certificate that is signed through a proper certification hierarchy, the +roots of that hierarchy should be in the truststore. + +If mutual authentication is enabled, the keystore and the truststore are used by both the server +endpoints and the REST clients. 
+ +Take note of the following configuration settings: + +```yaml +security.ssl.rest.keystore: /path/to/file.keystore +security.ssl.rest.keystore-password: keystore_password +security.ssl.rest.key-password: key_password +security.ssl.rest.truststore: /path/to/file.truststore +security.ssl.rest.truststore-password: truststore_password +security.ssl.rest.authentication-enabled: false +``` + +### Complete List of SSL Options + +{{< generated/security_configuration >}} Review comment: These configuration options not only include SSL but also Kerberos/SASL related options. ########## File path: docs/content/docs/deployment/security/running-cluster.md ########## @@ -0,0 +1,231 @@ +--- +title: Incorporating Security Features in a Running Cluster +weight: 4 +type: docs +aliases: +--- +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +# Incorporating Security Features in a Running Cluster + +This document briefly describes how Flink security works in the context of various deployment +mechanisms (Standalone, native Kubernetes, YARN), filesystems, connectors, and state backends. + +## Deployment Modes + +Here is some information specific to each deployment mode. + +### Standalone Mode + +Steps to run a secure Flink cluster in standalone/cluster mode: + +1. 
Add security-related configuration options to the Flink configuration file (on all cluster nodes) + (see [here]({{< ref "docs/deployment/config" >}}#auth-with-external-systems)). +2. Ensure that the keytab file exists at the path indicated by `security.kerberos.login.keytab` on + all cluster nodes. +3. Deploy Flink cluster as normal. + +### Native Kubernetes and YARN Mode + +Steps to run a secure Flink cluster in native Kubernetes and YARN mode: + +1. Add security-related configuration options to the Flink configuration file on the client + (see [here]({{< ref "docs/deployment/config" >}}#auth-with-external-systems)). +2. Ensure that the keytab file exists at the path as indicated by `security.kerberos.login.keytab` on + the client node. +3. Deploy Flink cluster as normal. + +In YARN and native Kubernetes mode, the keytab is automatically copied from the client to the Flink +containers. + +To enable Kerberos authentication, the Kerberos configuration file is also required. This file can be +either fetched from the cluster environment or uploaded by Flink. In the latter case, you need to +configure the `security.kerberos.krb5-conf.path` to indicate the path of the Kerberos configuration +file and Flink will copy this file to its containers/pods. + +For more information, see the [documentation on YARN security](https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md). + +#### Using `kinit` (YARN only) + +In YARN mode, it is possible to deploy a secure Flink cluster without a keytab, using only the ticket +cache (as managed by `kinit`). This avoids the complexity of generating a keytab and avoids entrusting +the cluster manager with it. In this scenario, the Flink CLI acquires Hadoop delegation tokens (for +HDFS and for HBase). The main drawback is that the cluster is necessarily short-lived since the generated +delegation tokens will expire (typically within a week). 
+ +Steps to run a secure Flink cluster using `kinit`: + +1. Add security-related configuration options to the Flink configuration file on the client + (see [here]({{< ref "docs/deployment/config" >}}#auth-with-external-systems)). +2. Login using the `kinit` command. +3. Deploy Flink cluster as normal. + + +## SSL - Tips for YARN Deployment + +For YARN, you can use the tools of Yarn to help: + +- Configuring security for internal communication is exactly the same as in the example above. + +- To secure the REST endpoint, you need to issue the REST endpoint's certificate such that it is + valid for all hosts that the JobManager may get deployed to. This can be done with a wild card + DNS name, or by adding multiple DNS names. + +- The easiest way to deploy keystores and truststore is by YARN client's *ship files* option (`-yt`). + Copy the keystore and truststore files into a local directory (say `deploy-keys/`) and start the + YARN session as follows: `flink run -m yarn-cluster -yt deploy-keys/ flinkapp.jar` + +- When deployed using YARN, Flink's web dashboard is accessible through YARN proxy's Tracking URL. + To ensure that the YARN proxy is able to access Flink's HTTPS URL, you need to configure YARN proxy + to accept Flink's SSL certificates. + For that, add the custom CA certificate into Java's default truststore on the YARN Proxy node. + + +## Creating and Deploying Keystores and Truststores + +Keys, Certificates, and the Keystores and Truststores can be generated using the [keytool utility](https://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html). +You need to have an appropriate Java Keystore and Truststore accessible from each node in the Flink cluster. + +- For standalone setups, this means copying the files to each node, or adding them to a shared mounted directory. +- For container based setups, add the keystore and truststore files to the container images. 
+- For Yarn setups, the cluster deployment phase can automatically distribute the keystore and truststore files. + +For the externally facing REST endpoint, the common name or subject alternative names in the certificate +should match the node's hostname and IP address. + +## Example SSL Setup Standalone and Kubernetes + +**Internal Connectivity** + +Execute the following keytool commands to create a key pair in a keystore: + +```bash +$ keytool -genkeypair \ + -alias flink.internal \ + -keystore internal.keystore \ + -dname "CN=flink.internal" \ + -storepass internal_store_password \ + -keyalg RSA \ + -keysize 4096 \ + -storetype PKCS12 +``` + +The single key/certificate in the keystore is used the same way by the server and client endpoints +(mutual authentication). The key pair acts as the shared secret for internal security, and we can +directly use it as keystore and truststore. + +```yaml +security.ssl.internal.enabled: true +security.ssl.internal.keystore: /path/to/flink/conf/internal.keystore +security.ssl.internal.truststore: /path/to/flink/conf/internal.keystore +security.ssl.internal.keystore-password: internal_store_password Review comment: +1
