Okay, let's discuss it here. So, currently we have 2 clientHostnameVerification options: quorum and client. They're implemented in the Quroum509Util and ClientX509Util classes respectively and share common code in X509Util base class. The 3 different options that you described makes sense to me, but it's only meaningful in Quorum communication, so it would be difficult to implement. Separating the 2 properties and changing the type of one of them will be very cumbersome.
Actually we only have 1 setting currently which is "hostnameVerification", so properties look like: - zookeeper.ssl.hostnameVerification (client-server) - zookeeper.ssl.quorum.hostnameVerification (quorum) Both controls the server hostname verification directly and client hostname verification indirectly. Your current patch introduces 2 new properties: - zookeeper.ssl.clientHostnameVerification (client-server) - zookeeper.ssl.quorum.clientHostnameVerification (quorum) They will directly control the client hostname verification with the restriction that the server side (hostnameVerification) must be enabled in order to get them working. Default values are 'false' for client- server and 'true' for quorum. This is backward compatible and can be committed as it is (tweak the docs first). Next patch will introduce another 2 parameters which are: - zookeeper.ssl.enableReverseDnsLookup (client-server) - zookeeper.ssl.quorum.enableReverseDnsLookup (quorum) Both of them will be true by default which keeps backward compatibility. If the option is disabled for client-server, the hostname verification won't try a reverse DNS lookup to get the hostname, but still checks if the IP address is listed in the SAN. If the option is disabled for quorum, hostname verification won't do reverse lookup, but checks the hostnames listed in zoo.cfg against the SAN _and_ if it fails, will do the same with the IP address as a last resort. We also have the zookeeper.fips-mode property which will remove the entire ZK hostname verification logic and uses the built-in JDK mechanism. Wdyt? Andor On Wed, 2024-11-20 at 09:21 +0100, Sönke Liebau wrote: > Happy to split this into two patches, my main worry was that we > notice in > the second patch, that we are not compatible with the first one from > a > config perspective any more. > > The proposed `clientHostnameVerification` parameters take a boolean > value I > think. One could argue that this parameter would also be appropriate > for > the new "hostname from list of peers" logic, as that is also a > hostnameVerification, so instead of it being a bool it could become a > list > of allowed values [ "none", "peer", "full"]. > With 'peer' being the new and improved thing, 'none' disabling all > checks > and 'full' being the current behavior. Full is probably not the > correct > term for this, maybe 'legacy' or 'dns' or something along these lines > could > be appropriate. > > It might be worth deciding on this up front, to make sure we don't > have to > backtrack at some point. > > Best regards, > Sönke > > On Wed, Nov 20, 2024 at 2:31 AM Andor Molnar <an...@apache.org> > wrote: > > > Sounds like a plan, but I would do the 2 things in separate > > patches: the > > current patch covers the toggle for client hostname verification > > and I > > think it’s already ready for submit. > > > > Another patch should cover the alternate way of hostname > > verification > > based on what we discussed. Could you please update the Jira > > ticket(s) > > accordingly to make sure we’re on the same page? > > > > Thanks, > > Andor > > > > > > > > > On Nov 19, 2024, at 15:19, Sönke Liebau > > <soenke.lie...@stackable.tech.INVALID> wrote: > > > > > > It would be nice, being able to find out which peer is trying to > > > connect, > > > but as you say, we won't have the id available before the > > > handshake is > > done > > > (unless we have users stick it in the cert, and I don't think we > > > want > > that > > > :) ) > > > Anything we could do to find out would probably need to involve > > > DNS and > > > some dirty compromise.. so probably best to just not go there. > > > > > > Your idea for backwards compatible config sounds sensible to me. > > > It would probably make sense to get this done as part of > > > https://github.com/apache/zookeeper/pull/2173 and discuss > > > possible > > config > > > changes up front for everything? > > > > > > Best, Sönke > > > > > > > > > On Tue, 19 Nov 2024, 16:46 Andor Molnar, <an...@kazinczy.hu> > > > wrote: > > > > > > > Correction, sorry: > > > > > > > > > Not necessarily one SAN matches, but _the one_ matches which > > corresponds > > > > to the peer. > > > > > > > > I meant not any of the listed server names matches in zoo.cfg, > > > > but we > > > > might be able to identify which peer is trying to connect. This > > > > is > > probably > > > > not feasible, because at the time of TLS handshake the peer id > > > > is not > > yet > > > > available. > > > > > > > > We should do this in backward compatible way: client hostname > > > > verification, by default, works at is, but we introduce an > > > > option to > > > > disable reverse DNS lookups. That switches the behavior to > > > > check against > > > > list of hostnames in zoo.cfg. In addition we enable client > > > > hostname > > > > verification to be completely disabled too. > > > > > > > > Wdyt? > > > > > > > > Andor > > > > > > > > > > > > > > > > > > > > > On Nov 19, 2024, at 09:38, Andor Molnar <an...@kazinczy.hu> > > > > > wrote: > > > > > > > > > > Hi Sönke, > > > > > > > > > > Are you still working on the patch btw? > > > > > https://github.com/apache/zookeeper/pull/2173 > > > > > > > > > > (Though I’m not sure if it was you who opened it) > > > > > > > > > > Thanks for the pointer, ZooKeeper support for K8s is > > > > > unfortunately > > still > > > > far from perfect. Introducing the FIPS mode option might not > > > > have been > > the > > > > best choice, but we first faced the issue during FIPS > > > > deployment where > > you > > > > cannot use custom hostname verificator, so we had to disable it > > completely. > > > > > > > > > > I’d like to idea of > > > > > > > > > > "The ZK server could verify the SAN against the list of > > > > > servers > > > > (servers.N in the config). A peer should be able to connect on > > > > the > > quorum > > > > port if and only if at least one SAN matches at least one of > > > > the listed > > > > servers.” > > > > > > > > > > Not necessarily one SAN matches, but _the one_ matches which > > corresponds > > > > to the peer. Does that make sense? Not sure if technically > > > > feasible, but > > > > are you willing to create a patch? > > > > > > > > > > Regards, > > > > > Andor > > > > > > > > > > > > > > > > > > > > > > > > > > On Nov 19, 2024, at 03:11, Sönke Liebau > > > > <soenke.lie...@stackable.tech.INVALID> wrote: > > > > > > > > > > > > You are probably running into ZOOKEEPER-4790 [1] here. > > > > > > > > > > > > When we encountered this back in the day [2] we figured out > > > > > > that > > > > enabling > > > > > > FIPS mode bypasses all the ZK specific TLS checks and makes > > > > > > it work. > > In > > > > the > > > > > > ZK version you are on it is not yet enabled by default, you > > > > > > could > > either > > > > > > update or set *zookeeper.fips-mode *and this error > > > > > > _should_ go away. > > > > > > > > > > > > Good luck :) > > > > > > > > > > > > Best, > > > > > > Sönke > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/ZOOKEEPER-4790 > > > > > > [2] > > > > > > https://github.com/stackabletech/zookeeper-operator/issues/760 > > > > > > > > > > > > On Tue, Nov 19, 2024 at 9:08 AM Dharani (Jira) > > > > > > <j...@apache.org> > > wrote: > > > > > > > > > > > > > Dharani created ZOOKEEPER-4887: > > > > > > > ---------------------------------- > > > > > > > > > > > > > > Summary: Zookeeper quorum formation fails when > > > > > > > TLS is > > > > enabled > > > > > > > in k8s env > > > > > > > Key: ZOOKEEPER-4887 > > > > > > > URL: > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-4887 > > > > > > > Project: ZooKeeper > > > > > > > Issue Type: Bug > > > > > > > Affects Versions: 3.8.3 > > > > > > > Reporter: Dharani > > > > > > > > > > > > > > > > > > > > > We have three(3) node zookeeper cluster running as a pod > > > > > > > on > > Kubernetes > > > > > > > cluster, zookeeper quorum formation fails with TLS > > > > > > > handshake error, > > as > > > > the > > > > > > > server name in the https request does not match with any > > > > > > > of the SANs > > > > in the > > > > > > > certificate configured for zookeeper server. Server name > > > > > > > in the > > > > request is > > > > > > > of the form "x-x-x- > > > > > > > x.kubernetes.default.svc.cluster.local" (where > > > > x-x-x-x > > > > > > > is the IP address of the POD), and I am unable to > > > > > > > understand the > > reason > > > > > > > behind pre-pending FQDN with a IP address. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find below the extract of the error logs from the > > > > > > > zookeeper > > POD > > > > > > > {code:java} > > > > > > > [myid:] - ERROR [LearnerHandler-/192.168.220. > > > > > > > 10:46516:o.a.z.c.ZKTrustManager@191] - Failed to verify > > > > > > > host > > address: > > > > > > > 192.168.220.10 > > > > > > > javax.net.ssl.SSLPeerUnverifiedException: Certificate for > > > > <192.168.220.10> > > > > > > > doesn't match any of the subject alternative names: > > > > > > > [eric-data-coordinator-zk, eric-data-coordinator- > > > > > > > zk.zdhagxx1, > > > > > > > eric-data-coordinator-zk.zdhagxx1.svc, > > > > > > > eric-data-coordinator-zk.zdhagxx1.svc.cluster.local, > > > > > > > > > *.eric-data-coordinator-zk-ensemble- > > service.zdhagxx1.svc.cluster.local, > > > > > > > certified-scrape-target] > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKHostnameVerifier.matchIPAddress(ZKHos > > tnameVerifier.java:197) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVer > > ifier.java:165) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKTrustManager.performHostVerification( > > ZKTrustManager.java:180) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKTrustManager.checkClientTrusted(ZKTru > > stManager.java:93) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.checkClientCerts(CertificateMessage.java:1285) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.onConsumeCertificate(CertificateMessage.java:1204) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.consume(CertificateMessage.java:1181) > > > > > > > > > java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:3 > > 92) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeConte > > xt.java:443) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeConte > > xt.java:421) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.TransportContext.dispatch(TransportConte > > xt.java:183) > > > > > > > java.base/sun.security.ssl.SSLTransport.decode(SSLTranspo > > > > > > > rt.java:172) > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.j > > > > ava:1511) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSoc > > ketImpl.java:1421) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketIm > > pl.java:456) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.ensureNegotiated(SSLSocket > > Impl.java:926) > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.getSession(SSLSocketImpl.j > > ava:372) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.detectMode(UnifiedServerSocket.java:269) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.getSocket(UnifiedServerSocket.java:298) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.access$400(UnifiedServerSocket.java:172) > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInput > > Stream.getRealInputStream(UnifiedServerSocket.java:699) > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInput > > Stream.read(UnifiedServerSocket.java:693) > > > > > > > java.base/java.io > > > > .BufferedInputStream.fill(BufferedInputStream.java:252) > > > > > > > java.base/java.io > > > > .BufferedInputStream.read(BufferedInputStream.java:271) > > > > > > > java.base/java.io.DataInputStream.readInt(DataInputStream > > > > > > > .java:392) > > > > > > > > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java: > > 96) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumP > > acket.java:86) > > > > > > > > > > > > > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.ja > > va:134) > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandle > > r.java:472)[myid:] > > > > > > > - ERROR [LearnerHandler-/192.168.220. > > > > 10:46516:o.a.z.c.ZKTrustManager@192] > > > > > > > - Failed to verify hostname: > > > > > > > 192-168-220-10.eric-data-coordinator- > > > > > > > zk.zdhagxx1.svc.cluster.local > > > > > > > javax.net.ssl.SSLPeerUnverifiedException: Certificate for > > > > > > > <192-168-220-10.eric-data-coordinator- > > > > > > > zk.zdhagxx1.svc.cluster.local> > > > > > > > doesn't match any of the subject alternative names: > > > > > > > [eric-data-coordinator-zk, eric-data-coordinator- > > > > > > > zk.zdhagxx1, > > > > > > > eric-data-coordinator-zk.zdhagxx1.svc, > > > > > > > eric-data-coordinator-zk.zdhagxx1.svc.cluster.local, > > > > > > > > > *.eric-data-coordinator-zk-ensemble- > > service.zdhagxx1.svc.cluster.local, > > > > > > > certified-scrape-target] > > > > > > > > > > > > > org.apache.zookeeper.common.ZKHostnameVerifier.matchDNSName(ZKHostn > > ameVerifier.java:230) > > > > > > > > > > > > > org.apache.zookeeper.common.ZKHostnameVerifier.verify(ZKHostnameVer > > ifier.java:171) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKTrustManager.performHostVerification( > > ZKTrustManager.java:189) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.common.ZKTrustManager.checkClientTrusted(ZKTru > > stManager.java:93) > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.checkClientCerts(CertificateMessage.java:1285) > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.onConsumeCertificate(CertificateMessage.java:1204) > > > > > > > > > > > > > java.base/sun.security.ssl.CertificateMessage$T13CertificateConsume > > r.consume(CertificateMessage.java:1181) > > > > > > > > > java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:3 > > 92) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeConte > > xt.java:443) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeConte > > xt.java:421) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.TransportContext.dispatch(TransportConte > > xt.java:183) > > > > > > > java.base/sun.security.ssl.SSLTransport.decode(SSLTranspo > > > > > > > rt.java:172) > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.j > > > > ava:1511) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSoc > > ketImpl.java:1421) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketIm > > pl.java:456) > > > > > > > > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.ensureNegotiated(SSLSocket > > Impl.java:926) > > > > > > > > > > > > > java.base/sun.security.ssl.SSLSocketImpl.getSession(SSLSocketImpl.j > > ava:372) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.detectMode(UnifiedServerSocket.java:269) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.getSocket(UnifiedServerSocket.java:298) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedSocke > > t.access$400(UnifiedServerSocket.java:172) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInput > > Stream.getRealInputStream(UnifiedServerSocket.java:699) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.UnifiedServerSocket$UnifiedInput > > Stream.read(UnifiedServerSocket.java:693) > > > > > > > java.base/java.io > > > > .BufferedInputStream.fill(BufferedInputStream.java:252) > > > > > > > java.base/java.io > > > > .BufferedInputStream.read(BufferedInputStream.java:271) > > > > > > > java.base/java.io.DataInputStream.readInt(DataInputStream > > > > > > > .java:392) > > > > > > > > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java: > > 96) > > > > > > > > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumP > > acket.java:86) > > > > > > > > > > > > > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.ja > > va:134) > > > > > > > > > > > > > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandle > > r.java:472) > > > > > > > {code} > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > This message was sent by Atlassian Jira > > > > > > > (v8.20.10#820010) > > > > > > > > > > > > > > > > > > > > > > > >