clolov commented on code in PR #22398:
URL: https://github.com/apache/kafka/pull/22398#discussion_r3324669783
##########
docs/security/security-model.md:
##########
@@ -0,0 +1,132 @@
+---
+title: Security Model
+description: Apache Kafka Security Model
+weight: 8
+tags: ['kafka', 'docs', 'security']
+aliases:
+keywords:
+type: docs
+---
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+
+## Things You Need To Know
+
+- **Security is off by default.** A freshly installed Apache Kafka cluster accepts unauthenticated `PLAINTEXT` connections on every listener and applies no authorization. This is appropriate only for closed test environments. Production deployments **must** explicitly configure authentication, authorization, and transport encryption before being exposed to any untrusted network.
+- **Apache Kafka assumes a trusted operator.** Anyone with shell access to a broker, controller, or the underlying disks can read every topic, forge any principal, and rewrite ACLs. The security model protects messages in transit and arbitrates client access; it does not defend brokers from their own administrators.
+- **Apache Kafka assumes a trusted broker fleet.** Brokers and KRaft controllers exchange records, replication state, and metadata over the inter-broker and controller listeners. Any host that can authenticate on those listeners is effectively part of the cluster's trust boundary.
+- **The data plane and the control plane have different exposure.** Producer/consumer traffic, the Admin API, the Kafka Connect REST API, and JMX each have distinct authentication and authorization stories. Operators must configure them independently: securing one does not secure the others.
+- **Apache Kafka does not encrypt data at rest.** Log segments, index files, and snapshots are written as plain bytes. At-rest confidentiality is the responsibility of the underlying filesystem, block device, or message-level encryption performed by producers.
+- **Reporting vulnerabilities.** Suspected security issues should be reported privately to `[email protected]` per the [ASF security process](https://www.apache.org/security/). Do not file public JIRA tickets, GitHub issues, or mailing-list posts for unpatched vulnerabilities.
+
+## Listeners and the Network Boundary
+
+Apache Kafka brokers expose one or more **listeners**, each with an independent security configuration selected by `listener.security.protocol.map`. The four protocols are:
+
+| Protocol         | Authentication         | Encryption |
+|------------------|------------------------|------------|
+| `PLAINTEXT`      | None                   | None       |
+| `SSL`            | Optional mTLS          | TLS        |
+| `SASL_PLAINTEXT` | SASL                   | None       |
+| `SASL_SSL`       | SASL (+ optional mTLS) | TLS        |
+
+`inter.broker.listener.name` and `controller.listener.names` select which listeners carry replication and KRaft traffic respectively. A common pattern is to keep these on a dedicated internal listener (`SASL_SSL` or `SSL`) that is firewalled off from clients, so that a compromise of a client-facing listener cannot impersonate a broker.
+
+Operators should:
+
+1. Bind external listeners only to interfaces reachable by intended clients.
+2. Treat `advertised.listeners` as part of the security configuration: clients connect to whatever the broker advertises after the initial metadata fetch.
+3. Never expose the controller listener to client networks.
+
+## Authentication
+
+Apache Kafka supports two complementary authentication mechanisms; either may be used, and both can be combined on a `SASL_SSL` listener.
+
+### TLS Client Authentication (mTLS)
+
+When `ssl.client.auth` is `required` on a TLS listener, the client's X.509 certificate is verified against the broker's truststore. The authenticated principal is derived from the certificate's distinguished name via `ssl.principal.mapping.rules` (or a custom `KafkaPrincipalBuilder`).
+
+mTLS is the recommended mechanism for broker-to-broker and controller-to-broker traffic, because it requires no shared password material and rotates with the rest of the PKI.
+
+### SASL
+
+Apache Kafka ships with five SASL mechanisms, enabled per-listener via `sasl.enabled.mechanisms`:
+
+- **`GSSAPI`**: Kerberos. Recommended for environments that already operate a KDC; principals and credentials are managed externally.
+- **`SCRAM-SHA-256` / `SCRAM-SHA-512`**: Salted challenge/response with credentials stored in the cluster metadata. Credentials are managed with `kafka-configs.sh --alter --add-config 'SCRAM-SHA-512=...'`.
+- **`OAUTHBEARER`**: OAuth 2.0 bearer tokens, suitable for integration with an identity provider. The default unsecured implementation is for testing only; production deployments must configure a JWKS endpoint and validator.
+- **`PLAIN`**: Username/password sent in cleartext over the SASL channel. Acceptable only inside a `SASL_SSL` listener; never use it with `SASL_PLAINTEXT`.
+
+#### Delegation Tokens
+
+Once a client has authenticated via SASL or mTLS, it can request a short-lived **delegation token** that is then used as a `SCRAM-SHA-256` credential for subsequent connections. Delegation tokens are intended for distributed frameworks (Spark, Flink, Connect workers) that need to fan out to many tasks without distributing the original credential. Tokens inherit the requester's principal and ACLs, expire on a fixed schedule (`delegation.token.expiry.time.ms`), and can be invalidated by the owner.
+
+## Authorization
+
+Authentication establishes a `KafkaPrincipal`; authorization decides what that principal may do. Authorization is performed by the configured `authorizer.class.name`. Apache Kafka ships `org.apache.kafka.metadata.authorizer.StandardAuthorizer` for KRaft clusters.
+
+ACLs are tuples of `(principal, host, operation, resource pattern, permission)`. Resources are typed (`Topic`, `Group`, `Cluster`, `TransactionalId`, `DelegationToken`, `User`) and patterns may be `LITERAL` or `PREFIXED`.
+
+Defaults worth understanding:
+
+- If no authorizer is configured, **all authenticated principals have full access**. Configuring authentication without an authorizer provides identity but no authorization.
+- If an authorizer is configured but no ACLs match, access is **denied**. The exception is the principals listed in `super.users`, which bypass ACL checks entirely; treat that list as you would a root password.
+- `allow.everyone.if.no.acl.found=true` reverses the default-deny behaviour for resources that have no ACLs at all. It is a transitional aid for adding authorization to existing clusters and should not remain set in steady state.
+
+ACLs are managed with `kafka-acls.sh` or the AdminClient `createAcls`/`deleteAcls` APIs, which are themselves gated by ACLs on the `Cluster` resource.
+
+## Encryption in Transit
+
+TLS is configured per-listener via the standard `ssl.*` properties (`ssl.keystore.*`, `ssl.truststore.*`, `ssl.protocol`, `ssl.cipher.suites`, `ssl.enabled.protocols`). Recommendations:
+
+- Disable TLS versions below 1.2; prefer 1.3 where the JDK supports it.
+- Use distinct keystores for the inter-broker listener and any client-facing listener so that a leaked client-facing key cannot impersonate a broker.
+- Set `ssl.endpoint.identification.algorithm=https` on clients (the default since 2.0) so that the broker's certificate must match its hostname.
+- Rotate keystores using the dynamic broker configuration mechanism (`kafka-configs.sh --entity-type brokers --alter --add-config ...`) to avoid restarts.
+
+Kafka Connect, MirrorMaker 2, Kafka Streams, and the Schema Registry-style ecosystem tools all consume the same `ssl.*` and `sasl.*` client configs; securing the broker is necessary but not sufficient.

Review Comment:
   Good catch, apologies. This is left over from an earlier draft in which I also discussed how tools interact with Kafka, a section I ended up scrapping. I will remove this!
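
For readers following the thread, the listener and authorization guidance in the quoted section could be sketched as a broker configuration fragment roughly like the one below. This is an illustration under stated assumptions, not text from the PR: the listener names (`CLIENT`, `INTERNAL`, `CONTROLLER`), host names, addresses, ports, and principals are all hypothetical, and keystore, truststore, and quorum settings are left out, so the fragment is not a complete `server.properties`.

```properties
# Illustrative fragment only. Listener names, hosts, ports, and principals
# are hypothetical; keystore/truststore and quorum settings are omitted.
process.roles=broker,controller

# One client-facing listener plus dedicated internal and controller listeners.
listeners=CLIENT://0.0.0.0:9093,INTERNAL://10.0.0.5:9094,CONTROLLER://10.0.0.5:9095
# Clients connect to whatever is advertised here after the first metadata fetch.
advertised.listeners=CLIENT://broker1.example.com:9093,INTERNAL://broker1.internal.example.com:9094
listener.security.protocol.map=CLIENT:SASL_SSL,INTERNAL:SSL,CONTROLLER:SSL
inter.broker.listener.name=INTERNAL
controller.listener.names=CONTROLLER

# mTLS on the internal and controller listeners; SCRAM over TLS for clients.
listener.name.internal.ssl.client.auth=required
listener.name.controller.ssl.client.auth=required
sasl.enabled.mechanisms=SCRAM-SHA-512

# Default-deny authorization. The broker's own certificate principal
# (hypothetical DN below) needs access for inter-broker requests, granted
# here through super.users; ACLs on the Cluster resource are an alternative.
authorizer.class.name=org.apache.kafka.metadata.authorizer.StandardAuthorizer
super.users=User:admin;User:CN=broker1.internal.example.com
allow.everyone.if.no.acl.found=false

# Disable TLS versions below 1.2.
ssl.enabled.protocols=TLSv1.3,TLSv1.2
```

Keeping the `INTERNAL` and `CONTROLLER` listeners bound to a private address is what lets a firewall keep them away from client networks. The exact values would depend on the deployment.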
