rondagostino commented on a change in pull request #11503: URL: https://github.com/apache/kafka/pull/11503#discussion_r751611823
########## File path: core/src/main/scala/kafka/server/KafkaConfig.scala ########## @@ -2007,8 +2007,13 @@ class KafkaConfig private(doLog: Boolean, val props: java.util.Map[_, _], dynami "offsets.commit.required.acks must be greater or equal -1 and less or equal to offsets.topic.replication.factor") require(BrokerCompressionCodec.isValid(compressionType), "compression.type : " + compressionType + " is not valid." + " Valid options are " + BrokerCompressionCodec.brokerCompressionOptions.mkString(",")) - require(!processRoles.contains(ControllerRole) || controllerListeners.nonEmpty, - s"${KafkaConfig.ControllerListenerNamesProp} cannot be empty if the server has the controller role") + if (usesSelfManagedQuorum) { Review comment: Ok, I pushed a commit that checks the consistency of the ports between the `controller.quorum.voters` and (`controller.listener.names`/`listeners`) configs. This can only be done for the controller case since there is nothing in `listeners` for the broker-only case. Currently both `RaftManager` and `RaftControllerNodeProvider` take `config.controllerListenerNames.head` to determine how to connect. `RaftControllerNodeProvider` extends `ControllerNodeProvider` which exposes a single listener name/security protocol/SASL mechanism. So we aren't going to make `RaftControllerNodeProvider` sophisticated about which security protocol/SASL mechanism to use based on voter ports -- it has to always use a single one. This makes sense given that migrating communication to the ZK-based brokers (any one of which can become the controller) from one security protocol to another involves adding the second security protocol listener everywhere, migrating everybody over to using that second security protocol (e.g. via `inter.broker.security.protocol`), and then dropping the first listener -- so 3 rolls. We would do something similar in KRaft: add the second controller listener on all controllers, migrate all brokers to use it (by making that listener the first one in the controller listener names list), then drop the original listener on the controllers. It now becomes appropriate to discuss `RaftManager`. `RaftManager` is used on all KRaft nodes, including any that are broker-only nodes, and in that broker-only case it is not going to be unable to use the ports in `controller.quorum.voters` to identify a security protocol because there is nothing in `listeners` to find based on a port number. So it has to use `config.controllerListenerNames.head` in this case. `RaftManager` could potentially be sophisticated about which security protocol/SASL mechanism to use based on voter ports in the case where we have the controller role -- there are listeners that it can identify based on port numbers. But I don't think we would want to in this case, which generally corresponds to the above migration scenario. I think the way that scenario would work would be as follows: 1. Append a second entry to `controller.listener.names` to all nodes with the controller role plus a corresponding entry in `listeners`, and roll all controllers. This has no effect other than adding the second active controller listener to each controller -- in particular, they won't hear anything yet. 2. Now that all controllers have both listeners, swap the order of `controller.listener.names` and update `controller.quorum.voters` with the new port, and roll all controllers again. Now the controllers are talking to each other on that new listener/security protocol, and any brokers running in combined mode are as well. 3. Set `controller.listener.names` to be the new listener name and update `controller.quorum.voters` on any broker-only nodes and roll them. Now the broker-only nodes are also talking to the controllers over the new security protocol. 4. Eliminate the old entry from `controller.listener.names` (it is now the second in the list at this point, and we are deleting it) on any controller node and roll the controllers a third time. We have completed the migration. So it takes 4 rolls: 3 rolls of the controller nodes and one roll of the broker-only nodes. The point here is that we never have a situation where a running instance of a broker or a controller is talking to some controllers with one security protocol and others with another security protocol. Every node always talks to all controllers over the same security protocol at any particular moment. Assuming this all makes sense and we agree, I think this has the following implications: * I added validation of port consistency for controllers by checking that the port existed in at least one controller listener. I think this validation should be more strict: the port must match the first controller listener listed. * For broker-only nodes, `controller.listener.names` should only have a single entry. Given the point above that this is technically an incompatible change, I think the suggestion to log a warning is appropriate. Thoughts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org