[ 
https://issues.apache.org/jira/browse/KAFKA-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arushi Helms updated KAFKA-16820:
---------------------------------
    Description: 
 

We are migrating our Kafka cluster from ZooKeeper to KRaft mode. We run 
dedicated brokers and controllers with TLS enabled, and all endpoints are 
configured with IP addresses. 
The TLS-enabled setup works fine among the brokers; the certificate looks 
something like:
{noformat}
Common Name: *.kafka.service.consul
Subject Alternative Names: *.kafka.service.consul, IP 
Address:10.87.171.84{noformat}
Note:
 * The DNS name of the node does not match the CN, but since we use IPs for 
communication, we have provided the IPs as SANs.
 * The same applies to the controllers: their IPs are given as SANs in the 
certificate. 
 * The issue is not related to the migration, so we are sharing only the 
configuration relevant to TLS. 



In the current setup I am running 3 brokers and 3 controllers. 

Relevant controller configurations from one of the controllers:
{noformat}
KAFKA_CFG_PROCESS_ROLES=controller 
KAFKA_KRAFT_CLUSTER_ID=5kztjhJ4SxSu-kdiEYDUow
KAFKA_CFG_NODE_ID=6 
KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
 
KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER 
KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:SSL,INSIDE_SSL:SSL 
KAFKA_CFG_LISTENERS=CONTROLLER://10.87.170.6:9097{noformat}
Relevant broker configuration from one of the brokers:
{noformat}
KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
 
KAFKA_CFG_PROCESS_ROLES=broker 
KAFKA_CFG_NODE_ID=3 
KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE_SSL:SSL,OUTSIDE_SSL:SSL,CONTROLLER:SSL
 
KAFKA_CFG_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096
 
KAFKA_CFG_ADVERTISED_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096{noformat}
 

ISSUE 1: 
With this setup, the Kafka broker fails to connect to the controller with the 
following error:
{noformat}
[2024-05-22 17:53:46,413] ERROR [broker-2-to-controller-heartbeat-channel-manager]: Request 
BrokerRegistrationRequestData(brokerId=2, clusterId='5kztjhJ4SxSu-kdiEYDUow', 
incarnationId=7741fgH6T4SQqGsho8E6mw, listeners=[Listener(name='INSIDE_SSL', 
host='10.87.170.81', port=9093, securityProtocol=1), Listener(name='INSIDE', 
host='10.87.170.81', port=9094, securityProtocol=0), Listener(name='OUTSIDE', 
host='10.87.170.81', port=9092, securityProtocol=0), 
Listener(name='OUTSIDE_SSL', host='10.87.170.81', port=9096, 
securityProtocol=1)], features=[Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=19)], rack=null, 
isMigratingZkBroker=false, logDirs=[TJssfKDD-iBFYfIYCKOcew], 
previousBrokerEpoch=-1) failed due to authentication error with controller 
(kafka.server.NodeToControllerRequestThread)
org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
    at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
    at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
    at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
    at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
    at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
    at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
    at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
    at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
    at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
    at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
    at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
    at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329)
    ... 19 more{noformat}
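
The trace above ends in HostnameChecker.matchDNS, which suggests the client was 
handed a host name rather than the configured IP. As a rough illustration of 
why that matters, here is a simplified model of server-identity matching: IP 
SANs are consulted only when the peer host is an IP literal, DNS SANs 
otherwise. This is a sketch, not the JDK's actual code; the class and method 
names are hypothetical, and it assumes the controller certificate carries the 
controller's own IP (10.87.170.83) as a SAN.

```java
import java.util.List;
import java.util.Locale;

/** Simplified model of TLS server-identity matching; NOT the JDK implementation. */
final class SanMatchSketch {

    /** True if the host looks like a dotted-quad IPv4 literal (IPv6 omitted for brevity). */
    static boolean isIpv4Literal(String host) {
        return host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    /** Matches a DNS SAN; a wildcard is honoured only as the whole left-most label. */
    static boolean dnsMatches(String pattern, String host) {
        String p = pattern.toLowerCase(Locale.ROOT);
        String h = host.toLowerCase(Locale.ROOT);
        if (p.startsWith("*.")) {
            int firstDot = h.indexOf('.');
            // Compare everything after the first label, e.g. ".kafka.service.consul".
            return firstDot > 0 && h.substring(firstDot).equals(p.substring(1));
        }
        return p.equals(h);
    }

    /** IP literals are checked against IP SANs only; names against DNS SANs only. */
    static boolean identityMatches(String peerHost, List<String> dnsSans, List<String> ipSans) {
        if (isIpv4Literal(peerHost)) {
            return ipSans.contains(peerHost);
        }
        return dnsSans.stream().anyMatch(san -> dnsMatches(san, peerHost));
    }

    public static void main(String[] args) {
        List<String> dnsSans = List.of("*.kafka.service.consul");
        List<String> ipSans = List.of("10.87.170.83"); // assumed SAN of controller 4

        // Peer addressed by IP: the IP SAN is consulted.
        System.out.println(identityMatches("10.87.170.83", dnsSans, ipSans)); // true

        // Peer addressed by the reverse-resolved name: only DNS SANs are consulted.
        System.out.println(identityMatches(
                "cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul",
                dnsSans, ipSans)); // false
    }
}
```

Under this model the IP SAN can never help once the connection is attempted 
with the reverse-resolved name, which is consistent with the error we see.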
 

ISSUE 2:
It looks like the KRaft controller also does a reverse DNS lookup for itself 
on startup, and we see the same DNS name mismatch on the controller. Log 
snippet from the controller with node ID 4:
{noformat}
[2024-05-16 20:57:07,962] INFO [SocketServer listenerType=CONTROLLER, nodeId=4] 
Failed authentication with /10.87.170.83 
(channelId=10.87.170.83:9097-10.87.170.83:42548-3) (SSL handshake failed) 
(org.apache.kafka.common.network.Selector)
[2024-05-16 20:57:11,118] INFO 
[ControllerRegistrationManager id=4 incarnation=HWT3UBxJSPGuefZ9xdqH-g] 
sendControllerRegistration: attempting to send 
ControllerRegistrationRequestData(controllerId=4, 
incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, 
listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, 
securityProtocol=1)], features=[Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=19)]) 
(kafka.server.ControllerRegistrationManager)
[2024-05-16 20:57:11,129] INFO 
[NodeToControllerChannelManager id=4 name=registration] Failed authentication 
with cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83 
(channelId=4) (SSL handshake failed) 
(org.apache.kafka.common.network.Selector)
[2024-05-16 20:57:11,130] INFO 
[NodeToControllerChannelManager id=4 name=registration] Node 4 disconnected. 
(org.apache.kafka.clients.NetworkClient)
[2024-05-16 20:57:11,130] INFO 
[SocketServer listenerType=CONTROLLER, nodeId=4] Failed authentication with 
/10.87.170.83 (channelId=10.87.170.83:9097-10.87.170.83:42564-4) (SSL handshake 
failed) (org.apache.kafka.common.network.Selector)
[2024-05-16 20:57:11,130] ERROR 
[NodeToControllerChannelManager id=4 name=registration] Connection to node 4 
(cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83:9097) 
failed authentication due to: SSL handshake failed 
(org.apache.kafka.clients.NetworkClient)
[2024-05-16 20:57:11,131] ERROR 
[controller-4-to-controller-registration-channel-manager]: Failed to send the 
following request due to authentication error: 
ClientRequest(expectResponse=true, 
callback=kafka.server.NodeToControllerRequestThread$$Lambda$850/0x00007fee184be288@41a1ff51, 
destination=4, correlationId=6, clientId=4, createdTimeMs=1715893031119, 
requestBuilder=ControllerRegistrationRequestData(controllerId=4, 
incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, 
listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, 
securityProtocol=1)], features=[Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=19)])) 
(kafka.server.NodeToControllerRequestThread)
[2024-05-16 20:57:11,131] ERROR 
[controller-4-to-controller-registration-channel-manager]: Request 
ControllerRegistrationRequestData(controllerId=4, 
incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, 
listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, 
securityProtocol=1)], features=[Feature(name='metadata.version', 
minSupportedVersion=1, maxSupportedVersion=19)]) failed due to authentication 
error with controller (kafka.server.NodeToControllerRequestThread)
org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
    at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
    at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
    at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
    at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
    at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
    at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
    at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
    at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
    at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
    at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
    at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
    at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
    at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329){noformat}
Queries:
1. IPs are used for communication and are present as SANs. Why does 
inter-broker communication work fine, but not broker-to-controller and 
controller-to-controller communication? 
2. Why is the controller doing a reverse DNS lookup? Is there a way to disable 
that? 
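
To make query 2 concrete: in the JDK, InetSocketAddress.getHostName() may 
perform a reverse lookup when the address was created from an IP literal, 
while getHostString() returns the configured string without any lookup. We 
have not confirmed which call Kafka uses on this code path, so the sketch 
below only illustrates the JDK behaviour; the class and method names are 
hypothetical.

```java
import java.net.InetSocketAddress;

/** Illustrates the lookup-free accessor; does not reproduce Kafka code. */
final class HostStringSketch {

    /** Returns the host as configured, without triggering a reverse DNS lookup. */
    static String configuredHost(InetSocketAddress address) {
        return address.getHostString();
    }

    public static void main(String[] args) {
        // One voter endpoint from KAFKA_CFG_CONTROLLER_QUORUM_VOTERS.
        InetSocketAddress voter = new InetSocketAddress("10.87.170.83", 9097);

        // No DNS involved: the IP literal comes back unchanged.
        System.out.println(configuredHost(voter)); // prints 10.87.170.83

        // By contrast, voter.getHostName() may ask the resolver for a PTR record
        // and return whatever name it finds; the result depends on the
        // environment, so it is deliberately not called here.
    }
}
```

If the reverse-resolved name is what ends up being used as the TLS peer host, 
that would explain why DNS SANs are checked instead of IP SANs.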

Note: we do not wish to set KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=" " 
as it would disable IP matching as well, per our understanding.

Please let me know if you need any other configuration or logs. 

 


> Kafka Broker fails to connect to Kraft Controller with no DNS matching 
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-16820
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16820
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>    Affects Versions: 3.7.0, 3.6.1, 3.8.0
>            Reporter: Arushi Helms
>            Priority: Major
>         Attachments: Screenshot 2024-05-22 at 1.09.11 PM-1.png
>
>
>  
> We are migrating our Kafka cluster from zookeeper to Kraft mode. We are 
> running individual brokers and controllers with TLS enabled and IPs are given 
> for communication. 
> TLS enabled setup works fine among the brokers and the certificate looks 
> something like:
> {noformat}
> Common Name: *.kafka.service.consul
> Subject Alternative Names: *.kafka.service.consul, IP 
> Address:10.87.171.84{noformat}
> Note:
>  * The DNS name for the node does not match the CN but since we are using IPs 
> as communication, we have provided IPs as SAN.
>  * Same with the controllers, IPs are given as SAN in the certificate. 
>  * Issue is not related to the migration so just sharing configuration 
> relevant for the TLS piece. 
> In the current setup I am running 3 brokers and 3 controllers. 
> Relevant controller configurations from one of the controllers:
> {noformat}
> KAFKA_CFG_PROCESS_ROLES=controller 
> KAFKA_KRAFT_CLUSTER_ID=5kztjhJ4SxSu-kdiEYDUow
> KAFKA_CFG_NODE_ID=6 
> KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
>  
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER 
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:SSL,INSIDE_SSL:SSL 
> KAFKA_CFG_LISTENERS=CONTROLLER://10.87.170.6:9097{noformat}
> Relevant broker configuration from one of the brokers:
> {noformat}
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
>  
> KAFKA_CFG_PROCESS_ROLES=broker 
> KAFKA_CFG_NODE_ID=3 
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE_SSL:SSL,OUTSIDE_SSL:SSL,CONTROLLER:SSL
>  
> KAFKA_CFG_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096
>  
> KAFKA_CFG_ADVERTISED_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096{noformat}
>  
> ISSUE 1: 
> With this setup Kafka broker is failing to connect to the controller, see the 
> following error:
> {noformat}
> 2024-05-22 17:53:46,413] ERROR 
> [broker-2-to-controller-heartbeat-channel-manager]: Request 
> BrokerRegistrationRequestData(brokerId=2, clusterId='5kztjhJ4SxSu-kdiEYDUow', 
> incarnationId=7741fgH6T4SQqGsho8E6mw, listeners=[Listener(name='INSIDE_SSL', 
> host='10.87.170.81', port=9093, securityProtocol=1), Listener(name='INSIDE', 
> host='10.87.170.81', port=9094, securityProtocol=0), Listener(name='OUTSIDE', 
> host='10.87.170.81', port=9092, securityProtocol=0), 
> Listener(name='OUTSIDE_SSL', host='10.87.170.81', port=9096, 
> securityProtocol=1)], features=[Feature(name='metadata.version', 
> minSupportedVersion=1, maxSupportedVersion=19)], rack=null, 
> isMigratingZkBroker=false, logDirs=[TJssfKDD-iBFYfIYCKOcew], 
> previousBrokerEpoch=-1) failed due to authentication error with controller 
> (kafka.server.NodeToControllerRequestThread)org.apache.kafka.common.errors.SslAuthenticationException:
>  SSL handshake failedCaused by: javax.net.ssl.SSLHandshakeException: No 
> subject alternative DNS name matching 
> cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329)
>     ... 19 more{noformat}
>  
> ISSUE 2:
> The KRaft controller also appears to do a reverse DNS lookup on its own 
> address at startup, and we see the same DNS name matching failure on the 
> controller. Log snippet from the controller with node ID 4:
> {noformat}
> [2024-05-16 20:57:07,962] INFO [SocketServer listenerType=CONTROLLER, nodeId=4] Failed authentication with /10.87.170.83 (channelId=10.87.170.83:9097-10.87.170.83:42548-3) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,118] INFO [ControllerRegistrationManager id=4 incarnation=HWT3UBxJSPGuefZ9xdqH-g] sendControllerRegistration: attempting to send ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)]) (kafka.server.ControllerRegistrationManager)
> [2024-05-16 20:57:11,129] INFO [NodeToControllerChannelManager id=4 name=registration] Failed authentication with cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83 (channelId=4) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] INFO [NodeToControllerChannelManager id=4 name=registration] Node 4 disconnected. (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,130] INFO [SocketServer listenerType=CONTROLLER, nodeId=4] Failed authentication with /10.87.170.83 (channelId=10.87.170.83:9097-10.87.170.83:42564-4) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] ERROR [NodeToControllerChannelManager id=4 name=registration] Connection to node 4 (cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83:9097) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,131] ERROR [controller-4-to-controller-registration-channel-manager]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=kafka.server.NodeToControllerRequestThread$$Lambda$850/0x00007fee184be288@41a1ff51, destination=4, correlationId=6, clientId=4, createdTimeMs=1715893031119, requestBuilder=ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)])) (kafka.server.NodeToControllerRequestThread)
> [2024-05-16 20:57:11,131] ERROR [controller-4-to-controller-registration-channel-manager]: Request ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)]) failed due to authentication error with controller (kafka.server.NodeToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329){noformat}
> Queries:
> 1. We use IPs for communication and IPs as SANs. Why does inter-broker 
> communication work fine, but not broker-controller and 
> controller-controller communication? 
> 2. Why is the controller doing a reverse DNS lookup? Is there a way to 
> disable that? 
> Note: we do not wish to set KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=" " 
> as it would disable IP matching as well, per our understanding.
> Please let me know if you would like any other configuration or logs. 
>  
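The two failures above both end in HostnameChecker.matchDNS, which suggests the TLS layer was handed a DNS name rather than the configured IP. A minimal sketch of that distinction follows, assuming the verifier consults IP SANs only when the peer host string is an IP literal and DNS SANs otherwise. The class PeerHostSketch and its methods are hypothetical names for illustration and are not part of Kafka or the JDK; only InetSocketAddress.getHostString() is a real JDK call, documented as not attempting a reverse lookup. IPv6 is ignored here.

```java
import java.net.InetSocketAddress;

public class PeerHostSketch {
    // Hypothetical helper: true when the string is a dotted-quad IPv4 literal.
    static boolean isIpv4Literal(String host) {
        String[] parts = host.split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String p : parts) {
            if (p.isEmpty() || p.length() > 3) {
                return false;
            }
            for (char c : p.toCharArray()) {
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(p) > 255) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: which SAN entry type we assume a verifier would consult.
    static String sanTypeFor(String peerHost) {
        return isIpv4Literal(peerHost) ? "IP" : "DNS";
    }

    public static void main(String[] args) {
        // Built from a literal, so no name resolution is needed here.
        InetSocketAddress addr = new InetSocketAddress("10.87.170.83", 9097);
        // getHostString() returns the literal without a reverse lookup.
        String literal = addr.getHostString();
        System.out.println(literal + " -> " + sanTypeFor(literal));

        // The name that appears in the controller log after the reverse lookup.
        String resolved = "cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1";
        System.out.println(resolved + " -> " + sanTypeFor(resolved));
    }
}
```

Under that assumption, a certificate whose only matching SAN is an IP address passes while the peer is identified by the literal and fails once the peer is identified by the reverse-resolved name, which is consistent with the logs above.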



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
