Bharath B created ZOOKEEPER-3882:
------------------------------------

             Summary: Zookeeper quorum formation fails when TLS is enabled in k8s env
                 Key: ZOOKEEPER-3882
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3882
             Project: ZooKeeper
          Issue Type: Bug
          Components: leaderElection, quorum
    Affects Versions: 3.5.7
         Environment: *Configurations set in zookeeper.properties* 

maxClientCnxns=0

serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory
clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
secureClientPort=2181
dataDir=<data_dir_path>
dataLogDir=<data_log_dir_path>
ssl.protocol=TLSv1.2
ssl.keyStore.location=<keystore_path>
ssl.keyStore.password=<password>
ssl.keyStore.type=JKS
ssl.trustStore.location=<truststore_path>
ssl.trustStore.password=<password>
ssl.trustStore.type=JKS
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
initLimit=5
syncLimit=2
sslQuorum=true
ssl.quorum.keyStore.location=<keystore_path>
ssl.quorum.keyStore.password=<password>
ssl.quorum.keyStore.type=JKS
ssl.quorum.trustStore.location=<truststore_path>
ssl.quorum.trustStore.password=<password>
ssl.quorum.trustStore.type=JKS
ssl.quorum.protocol=TLSv1.2

*Logs*

[2020-07-07 20:28:31,373] INFO Reading configuration from: 
/opt/zookeeper/config/zookeeper.properties 
(org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-07-07 20:28:31,375] INFO clientPort is not set 
(org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-07-07 20:28:31,381] INFO secureClientPortAddress is 0.0.0.0:2181 
(org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-07-07 20:28:31,390] INFO Setting -D 
jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS 
renegotiation (org.apache.zookeeper.common.X509Util)
[2020-07-07 20:28:31,404] INFO autopurge.snapRetainCount set to 3 
(org.apache.zookeeper.server.DatadirCleanupManager)
[2020-07-07 20:28:31,404] INFO autopurge.purgeInterval set to 0 
(org.apache.zookeeper.server.DatadirCleanupManager)
[2020-07-07 20:28:31,404] INFO Purge task is not scheduled. 
(org.apache.zookeeper.server.DatadirCleanupManager)
[2020-07-07 20:28:31,405] INFO Log4j found with jmx enabled. 
(org.apache.zookeeper.jmx.ManagedUtil)
[2020-07-07 20:28:31,422] INFO Starting quorum peer 
(org.apache.zookeeper.server.quorum.QuorumPeerMain)
[2020-07-07 20:28:31,525] INFO zookeeper.client.portUnification=false 
(org.apache.zookeeper.server.NettyServerCnxnFactory)
[2020-07-07 20:28:31,607] INFO Using 
org.apache.zookeeper.server.NettyServerCnxnFactory as server connection factory 
(org.apache.zookeeper.server.ServerCnxnFactory)
[2020-07-07 20:28:31,619] INFO zookeeper.snapshot.trust.empty : false 
(org.apache.zookeeper.server.persistence.FileTxnSnapLog)
[2020-07-07 20:28:31,629] INFO Local sessions disabled 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,629] INFO Local session upgrading disabled 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,630] INFO tickTime set to 3000 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,630] INFO minSessionTimeout set to 6000 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,630] INFO maxSessionTimeout set to 60000 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,630] INFO initLimit set to 5 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,654] INFO zookeeper.snapshotSizeFactor = 0.33 
(org.apache.zookeeper.server.ZKDatabase)
[2020-07-07 20:28:31,660] INFO Using TLS encrypted quorum communication 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,660] INFO Port unification disabled 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,660] INFO QuorumPeer communication is not secured! (SASL 
auth disabled) (org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,660] INFO quorum.cnxn.threads.size set to 20 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,672] INFO Snapshotting: 0x0 to 
/opt/zookeeper/data/version-2/snapshot.0 
(org.apache.zookeeper.server.persistence.FileTxnSnapLog)
[2020-07-07 20:28:31,676] INFO currentEpoch not found! Creating with a 
reasonable default of 0. This should only happen when you are upgrading your 
installation (org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,706] INFO acceptedEpoch not found! Creating with a 
reasonable default of 0. This should only happen when you are upgrading your 
installation (org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,717] INFO binding to port 0.0.0.0/0.0.0.0:2181 
(org.apache.zookeeper.server.NettyServerCnxnFactory)
[2020-07-07 20:28:31,827] INFO bound to port 2181 
(org.apache.zookeeper.server.NettyServerCnxnFactory)
[2020-07-07 20:28:31,831] INFO Election port bind maximum retries is 3 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2020-07-07 20:28:31,835] INFO Creating TLS-only quorum server socket 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2020-07-07 20:28:31,837] INFO My election bind port: 
zookeeper1/172.16.13.150:3888 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2020-07-07 20:28:31,846] INFO LOOKING 
(org.apache.zookeeper.server.quorum.QuorumPeer)
[2020-07-07 20:28:31,847] INFO New election. My id = 1, proposed zxid=0x0 
(org.apache.zookeeper.server.quorum.FastLeaderElection)
[2020-07-07 20:28:32,331] INFO Received connection request x.x.x.x:50278 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
[2020-07-07 20:28:32,742] ERROR Failed to verify host address: x.x.x.x 
(org.apache.zookeeper.common.ZKTrustManager)
javax.net.ssl.SSLPeerUnverifiedException: Certificate for <x.x.x.x> doesn't 
match any of the subject alternative names: [zookeeper, zookeeper1, zookeeper2, 
zookeeper3, zookeeper1.odim.svc.cluster.local, 
zookeeper2.odim.svc.cluster.local, zookeeper3.odim.svc.cluster.local]
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.matchIPAddress(ZKHostnameVerifier.java:194)
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVerifier.java:164)
 at 
org.apache.zookeeper.common.ZKTrustManager.performHostVerification(ZKTrustManager.java:135)
 at 
org.apache.zookeeper.common.ZKTrustManager.checkClientTrusted(ZKTrustManager.java:74)
 at 
sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:2037)
 at sun.security.ssl.ServerHandshaker.processMessage(ServerHandshaker.java:233)
 at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1082)
 at sun.security.ssl.Handshaker.process_record(Handshaker.java:1010)
 at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1079)
 at 
sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1388)
 at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1416)
 at sun.security.ssl.SSLSocketImpl.getSession(SSLSocketImpl.java:2309)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.detectMode(UnifiedServerSocket.java:273)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.getSocket(UnifiedServerSocket.java:301)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.access$400(UnifiedServerSocket.java:180)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.getRealInputStream(UnifiedServerSocket.java:700)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.read(UnifiedServerSocket.java:694)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 at java.io.DataInputStream.readFully(DataInputStream.java:195)
 at java.io.DataInputStream.readLong(DataInputStream.java:416)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:524)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:478)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:934)
[2020-07-07 20:28:32,745] ERROR Failed to verify hostname: 
x-x-x-x.kubernetes.default.svc.cluster.local 
(org.apache.zookeeper.common.ZKTrustManager)
javax.net.ssl.SSLPeerUnverifiedException: Certificate for 
<x-x-x-x.kubernetes.default.svc.cluster.local> doesn't match any of the subject 
alternative names: [zookeeper, zookeeper1, zookeeper2, zookeeper3, 
zookeeper1.odim.svc.cluster.local, zookeeper2.odim.svc.cluster.local, 
zookeeper3.odim.svc.cluster.local]
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.matchDNSName(ZKHostnameVerifier.java:224)
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVerifier.java:170)
 at 
org.apache.zookeeper.common.ZKTrustManager.performHostVerification(ZKTrustManager.java:141)
 at 
org.apache.zookeeper.common.ZKTrustManager.checkClientTrusted(ZKTrustManager.java:74)
 at 
sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:2037)
 at sun.security.ssl.ServerHandshaker.processMessage(ServerHandshaker.java:233)
 at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1082)
 at sun.security.ssl.Handshaker.process_record(Handshaker.java:1010)
 at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1079)
 at 
sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1388)
 at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1416)
 at sun.security.ssl.SSLSocketImpl.getSession(SSLSocketImpl.java:2309)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.detectMode(UnifiedServerSocket.java:273)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.getSocket(UnifiedServerSocket.java:301)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.access$400(UnifiedServerSocket.java:180)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.getRealInputStream(UnifiedServerSocket.java:700)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.read(UnifiedServerSocket.java:694)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 at java.io.DataInputStream.readFully(DataInputStream.java:195)
 at java.io.DataInputStream.readLong(DataInputStream.java:416)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:524)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:478)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:934)
[2020-07-07 20:28:32,747] INFO Accepted TLS connection from 
x-x-x-x.kubernetes.default.svc.cluster.local/x.x.x.x:50278 - NONE - 
SSL_NULL_WITH_NULL_NULL (org.apache.zookeeper.server.quorum.UnifiedServerSocket)
[2020-07-07 20:28:32,747] WARN Exception reading or writing challenge: {} 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
javax.net.ssl.SSLException: Connection has been shutdown: 
javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: 
Failed to verify both host address and host name
 at sun.security.ssl.SSLSocketImpl.checkEOF(SSLSocketImpl.java:1554)
 at sun.security.ssl.AppInputStream.read(AppInputStream.java:95)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.read(UnifiedServerSocket.java:694)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 at java.io.DataInputStream.readFully(DataInputStream.java:195)
 at java.io.DataInputStream.readLong(DataInputStream.java:416)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:524)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:478)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:934)
Caused by: javax.net.ssl.SSLHandshakeException: 
java.security.cert.CertificateException: Failed to verify both host address and 
host name
 at sun.security.ssl.Alerts.getSSLException(Alerts.java:198)
 at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1967)
 at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:331)
 at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:325)
 at 
sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:2055)
 at sun.security.ssl.ServerHandshaker.processMessage(ServerHandshaker.java:233)
 at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1082)
 at sun.security.ssl.Handshaker.process_record(Handshaker.java:1010)
 at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1079)
 at 
sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1388)
 at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1416)
 at sun.security.ssl.SSLSocketImpl.getSession(SSLSocketImpl.java:2309)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.detectMode(UnifiedServerSocket.java:273)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.getSocket(UnifiedServerSocket.java:301)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocket.access$400(UnifiedServerSocket.java:180)
 at 
org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInputStream.getRealInputStream(UnifiedServerSocket.java:700)
 ... 9 more
Caused by: java.security.cert.CertificateException: Failed to verify both host 
address and host name
 at 
org.apache.zookeeper.common.ZKTrustManager.performHostVerification(ZKTrustManager.java:145)
 at 
org.apache.zookeeper.common.ZKTrustManager.checkClientTrusted(ZKTrustManager.java:74)
 at 
sun.security.ssl.ServerHandshaker.clientCertificate(ServerHandshaker.java:2037)
 ... 20 more
Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for 
<x-x-x-x.kubernetes.default.svc.cluster.local> doesn't match any of the subject 
alternative names: [zookeeper, zookeeper1, zookeeper2, zookeeper3, 
zookeeper1.odim.svc.cluster.local, zookeeper2.odim.svc.cluster.local, 
zookeeper3.odim.svc.cluster.local]
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.matchDNSName(ZKHostnameVerifier.java:224)
 at 
org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVerifier.java:170)
 at 
org.apache.zookeeper.common.ZKTrustManager.performHostVerification(ZKTrustManager.java:141)
 ... 22 more
[2020-07-07 20:28:32,772] WARN Cannot open channel to 2 at election address 
zookeeper2/10.100.210.113:3888 
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.SocketException: Broken pipe (Write failed)
 at java.net.SocketOutputStream.socketWrite0(Native Method)
 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
 at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
 at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
 at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
 at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
 at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
 at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:735)
 at sun.security.ssl.Handshaker.sendChangeCipherSpec(Handshaker.java:1189)
 at 
sun.security.ssl.ClientHandshaker.sendChangeCipherAndFinish(ClientHandshaker.java:1323)
 at 
sun.security.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:1233)
 at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:372)
 at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1082)
 at sun.security.ssl.Handshaker.process_record(Handshaker.java:1010)
 at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1079)
 at 
sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1388)
 at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1416)
 at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1400)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:650)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:713)
 at 
org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:626)
 at 
org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
 at 
org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
 at java.lang.Thread.run(Thread.java:748)
[2020-07-07 20:28:32,773] INFO Notification: 2 (message format version), 1 
(n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 
(n.peerEPoch), LOOKING (my state)0 (n.config version) 
(org.apache.zookeeper.server.quorum.FastLeaderElection)
            Reporter: Bharath B


I have deployed ZooKeeper (v3.5.7), as included in the Kafka (v2.5.0) bundle, 
in containers on Kubernetes, with 3 instances each of Kafka and ZooKeeper. 
Communication among the Kafka brokers, between Kafka and its clients, and 
between Kafka and ZooKeeper is all TLS-based and works as expected. However, 
ZooKeeper quorum formation fails with a TLS handshake error, because the peer 
host name checked during the handshake does not match any of the SANs in the 
certificate configured for the ZooKeeper server. The name being checked is of 
the form "x-x-x-x.kubernetes.default.svc.cluster.local" (where x-x-x-x is the 
IP address of the pod), and I do not understand why the FQDN is prefixed with 
an IP address. Could anyone please let me know whether I am missing some 
configuration, or whether the observed behavior is by design? Unlike Kafka, 
ZooKeeper does not seem to have an "ssl.client.auth" configuration parameter, 
so I am not sure whether it is the client or the server validation that fails 
during the handshake.

Please find below the extract of the error logs from the zookeeper1 POD

ERROR Failed to verify host address: x.x.x.x 
(org.apache.zookeeper.common.ZKTrustManager)
javax.net.ssl.SSLPeerUnverifiedException: Certificate for 
<x.x.x.x> doesn't match any of the subject alternative names: [zookeeper, 
zookeeper1, zookeeper2, zookeeper3, zookeeper1.odim.svc.cluster.local, 
zookeeper2.odim.svc.cluster.local, zookeeper3.odim.svc.cluster.local]
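
For illustration, here is a minimal Java sketch of the comparison that appears 
to fail. The stack traces show ZKTrustManager.performHostVerification first 
trying the peer's IP address (matchIPAddress) and then a host name for that 
peer (matchDNSName), presumably obtained by a reverse lookup of the pod IP. 
This is not ZooKeeper's code: the class SanMatchSketch, the method 
matchesAnySan, the pod IP 10.0.0.7 and the derived name are hypothetical 
stand-ins for the redacted values, and the match is a simplified exact, 
case-insensitive comparison without wildcard or IP-SAN handling.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SanMatchSketch {
    // SAN entries copied from the log extract above.
    static final List<String> SANS = Arrays.asList(
        "zookeeper", "zookeeper1", "zookeeper2", "zookeeper3",
        "zookeeper1.odim.svc.cluster.local",
        "zookeeper2.odim.svc.cluster.local",
        "zookeeper3.odim.svc.cluster.local");

    // Simplified check (an assumption, not ZooKeeper's verifier):
    // exact, case-insensitive comparison against each SAN entry.
    static boolean matchesAnySan(String peerIdentity, List<String> sans) {
        String wanted = peerIdentity.toLowerCase(Locale.ROOT);
        for (String san : sans) {
            if (san.toLowerCase(Locale.ROOT).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for the redacted x.x.x.x
        // and x-x-x-x.kubernetes.default.svc.cluster.local.
        String peerIp = "10.0.0.7";
        String peerName = "10-0-0-7.kubernetes.default.svc.cluster.local";

        // Neither identity the verifier tries is listed in the certificate.
        System.out.println("IP matches:   " + matchesAnySan(peerIp, SANS));
        System.out.println("Name matches: " + matchesAnySan(peerName, SANS));

        // The name from the server.N entries is listed, but per the log it
        // is not the name the verifier ends up checking.
        System.out.println("zookeeper2:   " + matchesAnySan("zookeeper2", SANS));
    }
}
```

Under these assumptions, the sketch suggests the handshake would only pass if 
the name the verifier derives for the connecting pod were one of the names in 
the certificate.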



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
