[ 
https://issues.apache.org/jira/browse/HDDS-9420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17777116#comment-17777116
 ] 

Sammi Chen commented on HDDS-9420:
----------------------------------

[~pifta],  the RFC-5246 says that the self-signed certificate MAY be omitted 
from the chain, which means it's fine to include it in the chain. I'm fine with 
the current behavior.  But I'm also fine if you want to remove the root 
certificate from the bundle.

> Enabling GRPC encryption causes SCM startup failure.  
> ------------------------------------------------------
>
>                 Key: HDDS-9420
>                 URL: https://issues.apache.org/jira/browse/HDDS-9420
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 1.4.0
>            Reporter: Sadanand Shenoy
>            Assignee: Sammi Chen
>            Priority: Blocker
>
> HDDS-8178 added a feature to support multiple sub CA certs in trust chain, In 
> SCM constructor if security is enabled and hdds.grpc.tls.enabled is true it 
> tries to load the keyStoresFactory
> {code:java}
> if (conf.isSecurityEnabled() && conf.isGrpcTlsEnabled()) {
>   KeyStoresFactory serverKeyFactory =
>       certificateClient.getServerKeyStoresFactory(); {code}
> This in turn calls loadKeyManager which tries to load the entire trust chain 
> {code:java}
> private X509ExtendedKeyManager loadKeyManager(CertificateClient caClient)
>     throws GeneralSecurityException, IOException {
>   PrivateKey privateKey = caClient.getPrivateKey();
>   List<X509Certificate> newCertList = caClient.getTrustChain(); {code}
> Loading the entire trust chain does a listCA call which is network call to 
> SCMSecurityProtocolServer
> {code:java}
> public List<String> updateCAList() throws IOException {
>   pemEncodedCACertsLock.lock();
>   try {
>     pemEncodedCACerts = getScmSecureClient().listCACertificate(); {code}
> All of this happens inside the StorageContainerManager constructor but the 
> services in SCM are started only after constructor is initialised and 
> scm.start() is called which means it is sending a request to security server 
> before it is even started thus leading to connection refused messages in SCM 
> startup like below,
> {code:java}
> 10:45:45.506 AM             INFO      SCMRatisServerImpl starting Raft server 
> for scm:7b4b7153-eb02-443b-b8f9-3b146931674c
> 10:45:47.563 AM             INFO      RetryInvocationHandler 
> com.google.protobuf.ServiceException: java.net.ConnectException: Call From 
> <HOSTNAME>/<IP> to <HOSTNAME>:9961 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> $Proxy11.submitRequest over nodeId=node1,nodeAddress=<HOSTNAME>/<IP>:9961 
> after 1 failover attempts. Trying to failover after sleeping for 2000ms.
> 10:45:49.565 AM             INFO      RetryInvocationHandler 
> com.google.protobuf.ServiceException: java.net.ConnectException: Call From 
> <HOSTNAME>/<IP> to <HOSTNAME>:9961 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused, while invoking 
> $Proxy11.submitRequest over nodeId=node1,nodeAddress=<HOSTNAME>/<IP>:9961 
> after 2 failover attempts. Trying to failover after sleeping for 2000ms.
> (repeated) {code}
> StackTrace
> {code:java}
> java.net.ConnectException: Connection refused
>     at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at 
> java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:205)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
>     at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:730)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:843)
>     at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:430)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1681)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1459)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy14.submitRequest(Unknown Source)
>     at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
>     at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy14.submitRequest(Unknown Source)
>     at 
> org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.submitRequest(SCMSecurityProtocolClientSideTranslatorPB.java:102)
>     at 
> org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.listCACertificate(SCMSecurityProtocolClientSideTranslatorPB.java:374)
>     at 
> org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.updateCAList(DefaultCertificateClient.java:933)
>     at 
> org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.listCA(DefaultCertificateClient.java:921)
>     at 
> org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.getTrustChain(DefaultCertificateClient.java:410)
>     at 
> org.apache.hadoop.hdds.security.ssl.ReloadingX509KeyManager.loadKeyManager(ReloadingX509KeyManager.java:204)
>     at 
> org.apache.hadoop.hdds.security.ssl.ReloadingX509KeyManager.<init>(ReloadingX509KeyManager.java:85)
>     at 
> org.apache.hadoop.hdds.security.ssl.PemFileBasedKeyStoresFactory.createKeyManagers(PemFileBasedKeyStoresFactory.java:83)
>     at 
> org.apache.hadoop.hdds.security.ssl.PemFileBasedKeyStoresFactory.init(PemFileBasedKeyStoresFactory.java:104)
>     at 
> org.apache.hadoop.hdds.security.x509.keys.SecurityUtil.getServerKeyStoresFactory(SecurityUtil.java:103)
>     at 
> org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.getServerKeyStoresFactory(DefaultCertificateClient.java:948)
>     at 
> org.apache.hadoop.hdds.scm.ha.HASecurityUtils.createSCMRatisTLSConfig(HASecurityUtils.java:345)
>     at 
> org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.<init>(SCMRatisServerImpl.java:109)
>     at 
> org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.<init>(SCMHAManagerImpl.java:97)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:646)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:400)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:597)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:609)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(StorageContainerManagerStarter.java:171)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm(StorageContainerManagerStarter.java:145)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:74)
>     at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:48)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:1953) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to