Swathi Mocharla created KAFKA-20583:
---------------------------------------
Summary: KRaft controller does not support dynamic SSL certificate
reload for CONTROLLER listener unlike broker listeners
Key: KAFKA-20583
URL: https://issues.apache.org/jira/browse/KAFKA-20583
Project: Kafka
Issue Type: New Feature
Components: kraft
Reporter: Swathi Mocharla
In a KRaft deployment, Traffic Broker (TB) listeners support dynamic SSL
certificate reload via {{kafka-configs --alter}}. This works because broker
listeners are managed by {{KafkaServer}}, which implements the
{{Reconfigurable}} interface and wires SSL context updates at runtime without
requiring a restart.
The Controller Quorum (CQ) node exposes a *CONTROLLER listener* for KRaft
metadata quorum traffic. There is *no equivalent mechanism* to dynamically
reload CQ's server-side SSL certificates (keystore / truststore). The only way
to rotate CQ server certificates is a pod restart.
----
*Steps to Demonstrate the Gap:*
# Deploy Kafka in KRaft mode with SSL enabled on the CONTROLLER listener
({{controller.listener.names=CONTROLLER}})
# For TB broker listeners, run:
kafka-configs --bootstrap-server <broker_host:port> --alter \
--entity-type brokers --entity-name <broker_id> \
--add-config ssl.keystore.location=<new_path>,ssl.keystore.password=<new_pass>
*Observe:* The TB listener picks up the new certificate dynamically; no restart
is needed.
# Rotate the server certificate in the CQ keystore (update the mounted
keystore file or Kubernetes secret).
# *Observe:* The CQ CONTROLLER listener continues to serve the old certificate
indefinitely. There is no {{kafka-configs}} command or API equivalent for CQ;
the dynamic reconfiguration path does not exist for {{KafkaRaftClient}}.
# Only a *pod restart* causes CQ to load the new certificate.
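
One way to check the observations in the steps above is to compare the certificate a listener serves before and after rotation. The following is a minimal diagnostic sketch that uses only the JDK and is not part of Kafka; the class name {{CertProbe}} and its methods are hypothetical. It installs a trust manager that accepts any certificate purely so that the leaf can be recorded, so it is suitable for inspection only. It also assumes the listener sends its own certificate before any client-authentication failure occurs.

```java
import java.net.InetSocketAddress;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import java.util.concurrent.atomic.AtomicReference;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Diagnostic only: records whatever leaf certificate a TLS listener presents.
// It trusts every certificate, so never reuse this for real traffic.
final class CertProbe {

    /** Hex-encoded SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Connects to host:port and returns the leaf certificate the server presented. */
    static X509Certificate fetchLeaf(String host, int port) throws Exception {
        AtomicReference<X509Certificate> seen = new AtomicReference<>();
        X509TrustManager recorder = new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                seen.set(chain[0]); // capture the leaf and accept unconditionally
            }
            public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        };
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, new TrustManager[] { recorder }, new SecureRandom());
        try (SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket()) {
            socket.connect(new InetSocketAddress(host, port), 5000);
            socket.setSoTimeout(5000);
            try {
                socket.startHandshake();
            } catch (Exception e) {
                // A listener that requires client auth may reject this probe after
                // it has already sent its own certificate; keep what was captured.
                if (seen.get() == null) {
                    throw e;
                }
            }
        }
        return seen.get();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: CertProbe <host> <port>");
            return;
        }
        X509Certificate cert = fetchLeaf(args[0], Integer.parseInt(args[1]));
        System.out.println("subject     = " + cert.getSubjectX500Principal());
        System.out.println("serial      = " + cert.getSerialNumber().toString(16));
        System.out.println("notAfter    = " + cert.getNotAfter());
        System.out.println("sha256(der) = " + sha256Hex(cert.getEncoded()));
    }
}
```

Saved as {{CertProbe.java}}, this can be run with {{java CertProbe.java <host> <port>}} against a TB listener and the CQ CONTROLLER listener, before and after rotation. Based on the steps above, the fingerprint would be expected to change on TB after {{kafka-configs --alter}} and to stay the same on CQ until the pod restarts.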
----
*Expected:* CQ's CONTROLLER listener reloads its server-side SSL certificate
dynamically, consistent with how TB broker listeners handle certificate
rotation. At minimum, a {{kafka-configs --alter}} equivalent should be
supported for the CONTROLLER listener's SSL context.
*Actual:* There is no mechanism to rotate CQ server certificates at runtime.
Pod restart is the only option. The gap is not a misconfiguration — dynamic SSL
reconfiguration for {{KafkaRaftClient}} was simply never implemented.
----
*Root Cause:* The CONTROLLER listener is handled by {{KafkaRaftClient}},
which does not implement or register a dynamic SSL reconfiguration hook. Unlike
{{KafkaServer}} (used for TB broker listeners), {{KafkaRaftClient}}'s SSL
channel builder is initialized once at startup and is never updated. The
{{Reconfigurable}} interface is not wired for the KRaft controller's SSL
context.
Relevant classes:
* {{KafkaRaftClient.java}}
* {{RaftManager.java}}
* {{SslFactory.java}} / {{SslChannelBuilder.java}}
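
To illustrate the kind of hook the root cause describes as missing, here is a minimal, self-contained sketch of the reconfiguration pattern. It is not Kafka code and not a proposed patch. {{ReconfigurableSketch}} is a local stand-in that mirrors the three methods declared on Kafka's {{Reconfigurable}} interface, and {{ReloadableKeystoreSettings}} is a hypothetical holder. The real interface signals invalid settings with Kafka's {{ConfigException}}; the sketch uses {{IllegalArgumentException}} to stay dependency-free.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Local stand-in for the shape of org.apache.kafka.common.Reconfigurable.
// This is NOT the real Kafka type; it only mirrors its three methods.
interface ReconfigurableSketch {
    Set<String> reconfigurableConfigs();
    void validateReconfiguration(Map<String, ?> configs);
    void reconfigure(Map<String, ?> configs);
}

// Hypothetical holder for a listener's keystore settings. A component that is
// "initialized once at startup" would read these values a single time; here they
// sit behind an AtomicReference so a later reconfigure() call can swap them.
final class ReloadableKeystoreSettings implements ReconfigurableSketch {
    static final String LOCATION = "ssl.keystore.location";
    static final String PASSWORD = "ssl.keystore.password";

    /** Immutable snapshot so readers always see a consistent location/password pair. */
    static final class Snapshot {
        final String location;
        final String password;

        Snapshot(String location, String password) {
            this.location = location;
            this.password = password;
        }
    }

    private final AtomicReference<Snapshot> current;

    ReloadableKeystoreSettings(String location, String password) {
        this.current = new AtomicReference<>(new Snapshot(location, password));
    }

    @Override
    public Set<String> reconfigurableConfigs() {
        return Set.of(LOCATION, PASSWORD);
    }

    @Override
    public void validateReconfiguration(Map<String, ?> configs) {
        // Reject the update before anything is swapped if a value is missing.
        for (String key : reconfigurableConfigs()) {
            Object value = configs.get(key);
            if (!(value instanceof String) || ((String) value).isEmpty()) {
                throw new IllegalArgumentException("missing or empty config: " + key);
            }
        }
    }

    @Override
    public void reconfigure(Map<String, ?> configs) {
        validateReconfiguration(configs);
        // Single atomic swap: new connections would pick up the new snapshot,
        // while code holding the old snapshot is unaffected.
        current.set(new Snapshot((String) configs.get(LOCATION), (String) configs.get(PASSWORD)));
    }

    Snapshot snapshot() {
        return current.get();
    }
}
```

In this sketch a failed validation leaves the previous snapshot in place, which is the behavior one would want from a certificate reload: a bad keystore path should not take down a listener that is currently serving a valid certificate.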
----
*Impact:*
* Certificate rotation for CQ requires a *rolling pod restart*, causing
temporary quorum disruption
* Particularly significant in environments with short certificate validity
periods or automated CA rotation
* Operational burden: CQ restarts must be coordinated carefully to avoid
losing quorum during rotation
* Inconsistent behavior between TB (dynamic) and CQ (restart required)
complicates certificate lifecycle management