[ https://issues.apache.org/jira/browse/KAFKA-16820 ]
Arushi Helms deleted comment on KAFKA-16820:
--------------------------------------
was (Author: JIRAUSER305554):
Hi [~soarez]
Thanks for your response.
We have
> Kafka Broker fails to connect to Kraft Controller with no DNS matching
> -----------------------------------------------------------------------
>
> Key: KAFKA-16820
> URL: https://issues.apache.org/jira/browse/KAFKA-16820
> Project: Kafka
> Issue Type: Bug
> Components: kraft
> Affects Versions: 3.7.0, 3.6.1, 3.8.0
> Reporter: Arushi Helms
> Priority: Major
> Attachments: Screenshot 2024-05-22 at 1.09.11 PM-1.png
>
>
>
> We are migrating our Kafka cluster from ZooKeeper to KRaft mode. We run
> brokers and controllers as separate nodes with TLS enabled, and all
> communication is configured by IP address.
> The TLS-enabled setup works fine among the brokers. A broker certificate
> looks like this:
> {noformat}
> Common Name: *.kafka.service.consul
> Subject Alternative Names: *.kafka.service.consul, IP
> Address:10.87.170.78{noformat}
> Notes:
> * The DNS name of a node does not match the CN, but since we use IPs for
> communication, we have added each node's IP as a SAN.
> * The same applies to the controllers: each controller certificate carries
> its IP as a SAN.
> * The issue is not related to the migration, so I am only sharing the
> configuration relevant to TLS.
> In the current setup I am running 3 brokers and 3 controllers.
> *CONTROLLER:*
> Relevant controller configurations from one of the controllers:
> {noformat}
> KAFKA_CFG_PROCESS_ROLES=controller
> KAFKA_KRAFT_CLUSTER_ID=5kztjhJ4SxSu-kdiEYDUow
> KAFKA_CFG_NODE_ID=6
> [email protected]:9097,[email protected]:9097,[email protected]:9097
>
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:SSL,INSIDE_SSL:SSL
> KAFKA_CFG_LISTENERS=CONTROLLER://10.87.170.6:9097{noformat}
> Controller certificate has:
> {noformat}
> Common Name: *.kafka.service.consul
> Subject Alternative Names: *.kafka.service.consul, IP
> Address:10.87.170.6{noformat}
>
> *BROKER:*
> Relevant broker configuration from one of the brokers:
> {noformat}
> KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
> KAFKA_CFG_INTER_BROKER_LISTENER_NAME=INSIDE_SSL
> [email protected]:9097,[email protected]:9097,[email protected]:9097
>
> KAFKA_CFG_PROCESS_ROLES=broker
> KAFKA_CFG_NODE_ID=3
> KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=INSIDE_SSL:SSL,OUTSIDE_SSL:SSL,CONTROLLER:SSL
>
> KAFKA_CFG_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096
>
> KAFKA_CFG_ADVERTISED_LISTENERS=INSIDE_SSL://10.87.170.78:9093,OUTSIDE_SSL://10.87.170.78:9096{noformat}
> Broker certificate has:
> {noformat}
> Common Name: *.kafka.service.consul
> Subject Alternative Names: *.kafka.service.consul, IP
> Address:10.87.170.78{noformat}
>
> ISSUE 1:
> With this setup, the Kafka broker fails to connect to the controller with the
> following error:
> {noformat}
> [2024-05-22 17:53:46,413] ERROR
> [broker-2-to-controller-heartbeat-channel-manager]: Request
> BrokerRegistrationRequestData(brokerId=2, clusterId='5kztjhJ4SxSu-kdiEYDUow',
> incarnationId=7741fgH6T4SQqGsho8E6mw, listeners=[Listener(name='INSIDE_SSL',
> host='10.87.170.81', port=9093, securityProtocol=1), Listener(name='INSIDE',
> host='10.87.170.81', port=9094, securityProtocol=0), Listener(name='OUTSIDE',
> host='10.87.170.81', port=9092, securityProtocol=0),
> Listener(name='OUTSIDE_SSL', host='10.87.170.81', port=9096,
> securityProtocol=1)], features=[Feature(name='metadata.version',
> minSupportedVersion=1, maxSupportedVersion=19)], rack=null,
> isMigratingZkBroker=false, logDirs=[TJssfKDD-iBFYfIYCKOcew],
> previousBrokerEpoch=-1) failed due to authentication error with controller
> (kafka.server.NodeToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS
> name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS
> name matching cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329)
>     ... 19 more{noformat}
>
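The trace above fails in HostnameChecker.matchDNS rather than in an IP comparison, which suggests the check ran against a host name instead of the configured IP. As a rough illustration only, the following Java sketch models how endpoint identification picks which SAN type to compare against. It is a simplified stand-in, not the JDK's implementation: the class SanMatchSketch and its methods are hypothetical names, only IPv4 literals are handled, and the wildcard rule is reduced to a single leftmost label.

```java
import java.util.List;
import java.util.Locale;

/** Simplified model of TLS endpoint identification; NOT the JDK's code. */
final class SanMatchSketch {

    /** True if the string looks like an IPv4 literal such as 10.87.170.6. */
    static boolean isIpv4Literal(String host) {
        return host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    /** Matches a host name against one DNS SAN; "*" may only replace the leftmost label. */
    static boolean dnsMatches(String host, String pattern) {
        String h = host.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            return h.equals(p);
        }
        int firstDot = h.indexOf('.');
        // "*.example" covers exactly one extra label: "a.example", not "a.b.example".
        return firstDot > 0 && h.substring(firstDot).equals(p.substring(1));
    }

    /**
     * If the peer is addressed by an IP literal, only IP SANs are consulted;
     * otherwise only DNS SANs are consulted.
     */
    static boolean matches(String peerHost, List<String> dnsSans, List<String> ipSans) {
        if (isIpv4Literal(peerHost)) {
            return ipSans.contains(peerHost);
        }
        return dnsSans.stream().anyMatch(san -> dnsMatches(peerHost, san));
    }

    public static void main(String[] args) {
        // SAN values taken from the controller certificate in the report.
        List<String> dnsSans = List.of("*.kafka.service.consul");
        List<String> ipSans = List.of("10.87.170.6");

        System.out.println(matches("10.87.170.6", dnsSans, ipSans));
        System.out.println(matches(
                "cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul",
                dnsSans, ipSans));
    }
}
```

With the SANs from the report, the IP 10.87.170.6 is accepted and the node's DNS name is rejected. That would be consistent with the observed error if a resolved name, rather than the IP, is what reaches the check.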
> ISSUE 2:
> It looks like the KRaft controller also performs a reverse DNS lookup on its
> own address at startup, and we see the same DNS name matching failure in the
> controller. Log snippet from the controller with node ID 4:
> {noformat}
> [2024-05-16 20:57:07,962] INFO [SocketServer listenerType=CONTROLLER,
> nodeId=4] Failed authentication with /10.87.170.83
> (channelId=10.87.170.83:9097-10.87.170.83:42548-3) (SSL handshake failed)
> (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,118] INFO [ControllerRegistrationManager id=4
> incarnation=HWT3UBxJSPGuefZ9xdqH-g] sendControllerRegistration: attempting to
> send ControllerRegistrationRequestData(controllerId=4,
> incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true,
> listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097,
> securityProtocol=1)], features=[Feature(name='metadata.version',
> minSupportedVersion=1, maxSupportedVersion=19)])
> (kafka.server.ControllerRegistrationManager)
> [2024-05-16 20:57:11,129] INFO [NodeToControllerChannelManager id=4
> name=registration] Failed authentication with
> cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83
> (channelId=4) (SSL handshake failed)
> (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] INFO [NodeToControllerChannelManager id=4
> name=registration] Node 4 disconnected.
> (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,130] INFO [SocketServer listenerType=CONTROLLER,
> nodeId=4] Failed authentication with /10.87.170.83
> (channelId=10.87.170.83:9097-10.87.170.83:42564-4) (SSL handshake failed)
> (org.apache.kafka.common.network.Selector)
> [2024-05-16 20:57:11,130] ERROR [NodeToControllerChannelManager id=4
> name=registration] Connection to node 4
> (cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1/10.87.170.83:9097)
> failed authentication due to: SSL handshake failed
> (org.apache.kafka.clients.NetworkClient)
> [2024-05-16 20:57:11,131] ERROR
> [controller-4-to-controller-registration-channel-manager]: Failed to send the
> following request due to authentication error:
> ClientRequest(expectResponse=true,
> callback=kafka.server.NodeToControllerRequestThread$$Lambda$850/0x00007fee184be288@41a1ff51,
> destination=4, correlationId=6, clientId=4, createdTimeMs=1715893031119,
> requestBuilder=ControllerRegistrationRequestData(controllerId=4,
> incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true,
> listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097,
> securityProtocol=1)], features=[Feature(name='metadata.version',
> minSupportedVersion=1, maxSupportedVersion=19)]))
> (kafka.server.NodeToControllerRequestThread)
> [2024-05-16 20:57:11,131] ERROR
> [controller-4-to-controller-registration-channel-manager]: Request
> ControllerRegistrationRequestData(controllerId=4,
> incarnationId=HWT3UBxJSPGuefZ9xdqH-g, zkMigrationReady=true,
> listeners=[Listener(name='CONTROLLER', host='10.87.170.83', port=9097,
> securityProtocol=1)], features=[Feature(name='metadata.version',
> minSupportedVersion=1, maxSupportedVersion=19)]) failed due to authentication
> error with controller (kafka.server.NodeToControllerRequestThread)
> org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
> Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS
> name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
>     at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1351)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1226)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1169)
>     at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396)
>     at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264)
>     at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
>     at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209)
>     at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:435)
>     at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:523)
>     at org.apache.kafka.common.network.SslTransportLayer.doHandshake(SslTransportLayer.java:373)
>     at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:293)
>     at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:178)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:543)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:585)
>     at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>     at kafka.server.NodeToControllerRequestThread.doWork(NodeToControllerChannelManager.scala:382)
>     at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:131)
> Caused by: java.security.cert.CertificateException: No subject alternative DNS
> name matching cp-internal-onecloud-kfkc1.cp-internal-onecloud-kfkc1 found.
>     at java.base/sun.security.util.HostnameChecker.matchDNS(HostnameChecker.java:212)
>     at java.base/sun.security.util.HostnameChecker.match(HostnameChecker.java:103)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:458)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkIdentity(X509TrustManagerImpl.java:418)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:292)
>     at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:144)
>     at java.base/sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1329){noformat}
> Queries:
> 1. Given that we use IPs for communication and have IPs as SANs, why does
> inter-broker communication work fine while broker-to-controller and
> controller-to-controller communication fails?
> 2. Why is the controller doing a reverse DNS lookup? Is there a way to disable
> that?
> Note: we do not wish to set
> KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=" " because, per our
> understanding, it would disable IP matching as well.
> Please let me know if you need any other configuration or logs.
>
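On the reverse DNS question: in the JDK, InetSocketAddress.getHostString() returns the host as it was supplied and does not attempt a reverse lookup, while getHostName() may perform one for an address created from an IP literal. Whether this is what happens inside the controller's channel manager is an assumption and is not confirmed by the logs. The sketch below (hypothetical class HostStringSketch) only demonstrates the difference between the two calls.

```java
import java.net.InetSocketAddress;

/** Illustrates getHostString() versus getHostName(); not Kafka code. */
final class HostStringSketch {

    /** Returns the host exactly as configured, without a reverse lookup. */
    static String configuredHost(InetSocketAddress address) {
        return address.getHostString();
    }

    public static void main(String[] args) {
        // Controller endpoint taken from the log snippet in the report.
        InetSocketAddress controller = new InetSocketAddress("10.87.170.83", 9097);

        // Stays an IP literal, so an IP SAN could be used for verification.
        System.out.println("getHostString(): " + configuredHost(controller));

        // May trigger a reverse lookup; the result depends on the local resolver.
        System.out.println("getHostName():   " + controller.getHostName());
    }
}
```

If a name obtained through the second call is handed to the TLS layer as the peer host, verification would run against DNS SANs, not IP SANs.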
--
This message was sent by Atlassian Jira
(v8.20.10#820010)