[ 
https://issues.apache.org/jira/browse/KAFKA-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848821#comment-17848821
 ] 

Vikash Mishra commented on KAFKA-16820:
---------------------------------------

Even though the brokers are configured with controller IPs, they still try to 
communicate using the controller's DNS name; see the error below. This differs 
from the behavior without KRaft, where inter-broker communication, when 
configured with IPs, uses those IPs and the SSL handshake succeeds via the IP SAN.
{code:java}
failed due to authentication error with controller (kafka.server.NodeToControllerRequestThread)
org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.{code}
Is there a configuration to disable this behavior and use IP-based communication 
between broker and controller when IPs are provided at bootstrap? Or is this the 
default behavior going forward?
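
To make the question concrete, here is a minimal Java sketch of how an address created from an IP literal can turn into a DNS name before the handshake. It only illustrates JDK behavior and is not taken from Kafka's code; the class and method names (`HostLookupSketch`, `configuredHost`, `resolvedHost`) are hypothetical, and where exactly Kafka performs the lookup is not established here.

```java
import java.net.InetSocketAddress;

public class HostLookupSketch {

    // Returns the host exactly as it was configured. For an IP literal this
    // is the literal itself, and no reverse DNS lookup is performed.
    static String configuredHost(String ip, int port) {
        return new InetSocketAddress(ip, port).getHostString();
    }

    // May perform a reverse DNS lookup. If a PTR record exists, the result
    // is a DNS name rather than the IP that was configured.
    static String resolvedHost(String ip, int port) {
        return new InetSocketAddress(ip, port).getHostName();
    }

    public static void main(String[] args) {
        // Controller address taken from the quorum voters in the report.
        String ip = "10.87.170.83";
        int port = 9097;

        System.out.println("configured: " + configuredHost(ip, port));

        // The reverse lookup depends on the local resolver, so it only runs
        // when requested explicitly: java HostLookupSketch --resolve
        if (args.length > 0 && args[0].equals("--resolve")) {
            System.out.println("resolved:   " + resolvedHost(ip, port));
        }
    }
}
```

If the name returned by a reverse lookup is the one handed to the TLS layer, hostname verification would check DNS SANs instead of IP SANs, which would be consistent with the error above.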

Looping in [~soarez] and [~jlprat], release managers of 3.7.1 and 3.8.0, to 
confirm whether the reported issue has already been identified as a known issue 
in the upcoming releases.

> Kafka Broker fails to connect to Kraft Controller with no DNS matching 
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-16820
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16820
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>    Affects Versions: 3.7.0, 3.6.1, 3.8.0
>            Reporter: Arushi Helms
>            Priority: Major
>         Attachments: Screenshot 2024-05-22 at 1.09.11 PM-1.png
>
>
>  
> We are migrating our Kafka cluster from ZooKeeper to KRaft mode. We are 
> running individual brokers and controllers with TLS enabled, and IPs are used 
> for communication. 
> The TLS-enabled setup works fine among the brokers, and the certificate looks 
> something like:
> {noformat}
> Common Name: *.kafka.service.consul
> Subject Alternative Names: *.kafka.service.consul, IP 
> Address:10.87.171.84{noformat}
> Note:
>  * The DNS name for the node does not match the CN, but since we use IPs 
> for communication, we have provided the IPs as SANs.
>  * The same applies to the controllers: IPs are given as SANs in the certificate. 
>  * The issue is not related to the migration, so we are sharing only the 
> configuration relevant to TLS. 
> In the current setup I am running 3 brokers and 3 controllers. 
> Relevant configuration from one of the controllers:
> {noformat}
> KAFKA_CFG_PROCESS_ROLES=controller
> KAFKA_KRAFT_CLUSTER_ID=5kztjhJ4SxSu-kdiEYDUow
> KAFKA_CFG_NODE_ID=6
> KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:SSL,INSIDE_SSL:SSL
> KAFKA_CFG_LISTENERS=CONTROLLER://10.87.170.6:9097{noformat}
> Relevant broker configuration from one of the brokers:
> {noformat}
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=4@10.87.170.83:9097,5@10.87.170.9:9097,6@10.87.170.6:9097
> KAFKA_CFG_PROCESS_ROLES=broker
> KAFKA_CFG_NODE_ID=3
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE_SSL:SSL,OUTSIDE_SSL:SSL,CONTROLLER:SSL
> KAFKA_CFG_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096
> KAFKA_CFG_ADVERTISED_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096{noformat}
>  
> ISSUE 1: 
> With this setup, the Kafka broker fails to connect to the controller; see the 
> following error:
> {noformat}
> [2024-05-22 17:53:46,413] ERROR [broker-2-to-controller-heartbeat-channel-manager]: Request BrokerRegistrationRequestData(brokerId=2, clusterId='5kztjhJ4SxSu-kdiEYDUow', incarnationId=7741fgH6T4SQqGsho8E6mw, listeners=[Listener(name='INSIDE_SSL', host='10.87.170.81', port=9093, securityProtocol=1), Listener(name='INSIDE', host='10.87.170.81', port=9094, securityProtocol=0), Listener(name='OUTSIDE', host='10.87.170.81', port=9092, securityProtocol=0), Listener(name='OUTSIDE_SSL', host='10.87.170.81', port=9096, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)], rack=null, isMigratingZkBroker=false, logDirs=[TJssfKDD-iBFYfIYCKOcew], previousBrokerEpoch=-1) failed due to authentication error with controller (kafka.server.NodeToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329)
>     ... 19 more{noformat}
>  
> ISSUE 2:
> It looks like the KRaft controller also does a reverse DNS lookup for itself 
> on startup, and we are seeing the same DNS name matching issue in the 
> controller. Log snippet from the controller with node ID 4:
> {noformat}
> [2024-05-16 20:57:07,962] INFO [SocketServer listenerType=CONTROLLER, nodeId=4] Failed authentication with /10.87.170.83 (channelId=10.87.170.83:9097-10.87.170.83:42548-3) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,118] INFO [ControllerRegistrationManager id=4 incarnation=HWT3UBxJSPGuefZ9xdqH-g] sendControllerRegistration: attempting to send ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)]) (kafka.server.ControllerRegistrationManager)
> [2024-05-16 20:57:11,129] INFO [NodeToControllerChannelManager id=4 name=registration] Failed authentication with cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83 (channelId=4) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] INFO [NodeToControllerChannelManager id=4 name=registration] Node 4 disconnected. (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,130] INFO [SocketServer listenerType=CONTROLLER, nodeId=4] Failed authentication with /10.87.170.83 (channelId=10.87.170.83:9097-10.87.170.83:42564-4) (SSL handshake failed) (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] ERROR [NodeToControllerChannelManager id=4 name=registration] Connection to node 4 (cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83:9097) failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,131] ERROR [controller-4-to-controller-registration-channel-manager]: Failed to send the following request due to authentication error: ClientRequest(expectResponse=true, callback=kafka.server.NodeToControllerRequestThread$$Lambda$850/0x00007fee184be288@41a1ff51, destination=4, correlationId=6, clientId=4, createdTimeMs=1715893031119, requestBuilder=ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)])) (kafka.server.NodeToControllerRequestThread)
> [2024-05-16 20:57:11,131] ERROR [controller-4-to-controller-registration-channel-manager]: Request ControllerRegistrationRequestData(controllerId=4, incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097, securityProtocol=1)], features=[Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=19)]) failed due to authentication error with controller (kafka.server.NodeToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329){noformat}
> Queries:
> 1. Given IPs for communication and IPs as SANs, why does inter-broker 
> communication work fine but not broker-controller and controller-controller 
> communication? 
> 2. Why is the controller doing a reverse DNS lookup? Is there a way to disable 
> that? 
> Note: we do not wish to set 
> KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=" " as it would disable IP 
> matching as well, per our understanding.
> Please let me know if you would like any other configuration details or 
> logs. 
>  
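
For context on the endpoint identification note in the issue description, here is a minimal Java sketch of how a JSSE client engine is told which peer name to verify. It is an illustration under assumptions, not Kafka's implementation: `EndpointIdSketch` and `clientEngine` are hypothetical names, and the JVM's default `SSLContext` stands in for a configured keystore and truststore.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

public class EndpointIdSketch {

    // Builds a client-side engine for the given peer. The peerHost string is
    // what the JDK later compares against the server certificate when an
    // endpoint identification algorithm is set.
    static SSLEngine clientEngine(String peerHost, int peerPort, String algorithm)
            throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine(peerHost, peerPort);
        engine.setUseClientMode(true);

        SSLParameters params = engine.getSSLParameters();
        // "HTTPS" enables hostname verification. An IP literal is matched
        // against IP SAN entries, any other name against DNS SAN entries.
        params.setEndpointIdentificationAlgorithm(algorithm);
        engine.setSSLParameters(params);
        return engine;
    }

    public static void main(String[] args) throws Exception {
        // With the IP as peer host, the certificate's IP SAN would be checked.
        SSLEngine byIp = clientEngine("10.87.170.83", 9097, "HTTPS");
        System.out.println(byIp.getPeerHost());

        // With a DNS name as peer host, only DNS SANs would be checked,
        // which is the kind of check that fails in the stack traces above.
        SSLEngine byName = clientEngine(
                "cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1", 9097, "HTTPS");
        System.out.println(byName.getPeerHost());
    }
}
```

The sketch only builds the engines and does not perform a handshake. It shows why the string passed as the peer host matters: the same certificate with only an IP SAN can pass verification for the first engine and fail for the second. Leaving the algorithm empty would skip this check entirely, for IPs and DNS names alike, which matches the reporter's concern.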



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
