José Armando García Sancio created KAFKA-20661:
--------------------------------------------------
Summary: KRaft unavailability when controllers are misconfigured
Key: KAFKA-20661
URL: https://issues.apache.org/jira/browse/KAFKA-20661
Project: Kafka
Issue Type: Bug
Components: kraft
Affects Versions: 3.9.0
Reporter: José Armando García Sancio
Assignee: José Armando García Sancio
Fix For: 4.4.0
It is possible for a controller quorum to become unavailable and get into an
unrecoverable state if the controller's advertised listener is misconfigured.

If the user misconfigures a controller's advertised listener to an address
that is not routable from the other controllers, the controller nodes won't be
able to reach each other. This causes the controller cluster to become
unavailable.

Because the active controller (the KRaft leader) persists the controllers'
unreachable endpoints in the cluster metadata partition, the quorum cannot
recover on its own: if the active controller loses leadership, no new leader
can be elected, since a candidate must reach the other controllers with VOTE
requests to establish leadership.
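
To make the failure mode concrete, here is a deliberately simplified sketch of
the majority rule. It is not KRaft's election code: the class and method names
are hypothetical, and real elections also compare epochs and log offsets. It
only models the point above, that a candidate's VOTE requests must reach a
majority of voters.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Kafka.
public class QuorumReachability {
    // Returns true if `candidate` can collect votes from a strict majority
    // of `voters`. A candidate always counts its own vote; every other vote
    // requires that the candidate can reach that voter's persisted endpoint.
    // Assumption: reachability is the only reason a vote could be missing.
    public static boolean canWinElection(
            int candidate,
            Set<Integer> voters,
            Map<Integer, Set<Integer>> reachableFrom) {
        Set<Integer> reachable = reachableFrom.getOrDefault(candidate, Set.of());
        int votes = 0;
        for (int voter : voters) {
            if (voter == candidate || reachable.contains(voter)) {
                votes++;
            }
        }
        return votes > voters.size() / 2;
    }
}
```

With three voters whose persisted endpoints are all unreachable, every
candidate holds only its own vote, so canWinElection returns false for each
of them and no leader is established.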

The proposed solution is for the leader to test the default listener before
accepting an UPDATE_VOTER request from an inactive controller. This guarantees
that at least the current leader can reach all of the other controllers.
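
The issue does not say how the leader would test the listener. As a minimal
sketch, assuming a plain TCP connect with a timeout is an acceptable probe,
the check could look like the following. ListenerProbe and its methods are
hypothetical names, not Kafka APIs, and the actual fix may well use a Kafka
RPC instead of a raw socket.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical illustration only; not part of Kafka.
public class ListenerProbe {
    // Returns true if a TCP connection to host:port succeeds within
    // timeoutMs. Unresolvable hosts, refused connections, and timeouts
    // all surface as IOException and are reported as "not reachable".
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Sketch of the gate described above: accept the voter update only if
    // the leader can reach the endpoint the voter advertises.
    public static boolean shouldAcceptVoterUpdate(String host, int port, int timeoutMs) {
        return isReachable(host, port, timeoutMs);
    }
}
```

Under this sketch, a request advertising a non-routable address would be
rejected before the endpoint is written to the cluster metadata partition.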
--
This message was sent by Atlassian Jira
(v8.20.10#820010)