Hi zk folks,

Problem(s)
==========
One problem that we're having with a custom Trust Manager in ZK is that FIPS doesn't allow it:
https://issues.apache.org/jira/browse/ZOOKEEPER-4393

In FIPS mode the only TrustManager allowed in the JDK is X509TrustManagerImpl, which is the default implementation. The class is final, so extending it is unfortunately not an option.

The intention behind implementing a custom trust manager in ZK was, I believe, the need for server- and client-side hostname verification. Hostname verification is officially not part of the SSL/TLS protocol; it's the responsibility of an upper-level protocol like HTTPS. Hacking hostname verification into the SSL handshake is nice and has worked fine so far, but unfortunately it breaks the FIPS standard.

Another annoying issue with ZKTrustManager is the need for reverse DNS lookup. This is usually needed when the hostname of the certificate provider is not known at the time of the handshake, for instance when somebody connects the client via IP address, which is generally not recommended when TLS is active in the client-server protocol.

The bigger problem I've found is in the leader election: when a peer connects with a smaller id, the node will close the existing connection and open a new one in the other direction, based on the information received in the InitialMessage from the peer, which only contains the IP address, not the hostname. Therefore the TrustManager needs to perform a reverse DNS lookup.

Tickets about reverse DNS lookup issues:
https://issues.apache.org/jira/browse/ZOOKEEPER-3860
https://issues.apache.org/jira/browse/ZOOKEEPER-4268

Proposal
========

I suggest removing ZKTrustManager entirely from the codebase and using the built-in, FIPS-enabled X509TrustManagerImpl instead.
It has the downside of losing hostname verification, but we have an option to re-enable it in client-server communication: Netty has built-in support for it; we just need to call sslParameters.setEndpointIdentificationAlgorithm("HTTPS"); when creating the SSLEngine, and that will result in behaviour very similar to what we provide currently. I can show some examples.

What we will truly lose is the hostname verification option in the Quorum and Leader Election protocols. Since we manipulate the sockets directly in these protocols, we would need to implement the verification manually.

What do you think about this trade-off?

Of course, we can put this change behind a feature flag "fips-mode", which will lead to a new mode in ZooKeeper that is actually less strict than the original behaviour.

Regards,
Andor
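
P.S. A minimal sketch of the client-server option, to make the setEndpointIdentificationAlgorithm("HTTPS") call concrete. It uses only plain JSSE classes; the class name, the createClientEngine helper, and the host and port values are hypothetical and not taken from the ZooKeeper codebase. With Netty, the engine would presumably come from Netty's SslContext instead of the JDK SSLContext, but the SSLParameters step should look the same.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

public class HostnameVerificationSketch {

    // Hypothetical helper: builds a client-mode SSLEngine that asks the
    // default trust manager to check the peer certificate against peerHost.
    static SSLEngine createClientEngine(SSLContext ctx, String peerHost, int peerPort) {
        // The peer host passed here is the name the certificate is checked against.
        SSLEngine engine = ctx.createSSLEngine(peerHost, peerPort);
        engine.setUseClientMode(true);

        // getSSLParameters() returns a copy, so it has to be set back on the engine.
        SSLParameters params = engine.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        engine.setSSLParameters(params);
        return engine;
    }

    public static void main(String[] args) throws Exception {
        // Assumed example values; no connection is opened here.
        SSLEngine engine = createClientEngine(SSLContext.getDefault(), "zk1.example.com", 2281);
        System.out.println(engine.getSSLParameters().getEndpointIdentificationAlgorithm());
    }
}
```

The verification itself happens inside the default trust manager during the handshake, which is why no custom TrustManager is needed for this path. The Quorum and Leader Election sockets are not covered by this sketch.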