[ 
https://issues.apache.org/jira/browse/HDDS-9420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krishna Kumar Asawa reassigned HDDS-9420:
-----------------------------------------

    Assignee: Sammi Chen  (was: István Fajth)

> Enabling GRPC encryption causes SCM startup failure.  
> ------------------------------------------------------
>
>                 Key: HDDS-9420
>                 URL: https://issues.apache.org/jira/browse/HDDS-9420
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sadanand Shenoy
>            Assignee: Sammi Chen
>            Priority: Major
>
> HDDS-8178 added support for multiple sub-CA certificates in the trust chain. In 
> the SCM constructor, if security is enabled and hdds.grpc.tls.enabled is true, 
> it tries to load the KeyStoresFactory:
> {code:java}
> if (conf.isSecurityEnabled() && conf.isGrpcTlsEnabled()) {
>   KeyStoresFactory serverKeyFactory =
>       certificateClient.getServerKeyStoresFactory(); {code}
> This in turn calls loadKeyManager, which tries to load the entire trust chain:
> {code:java}
> private X509ExtendedKeyManager loadKeyManager(CertificateClient caClient)
>     throws GeneralSecurityException, IOException {
>   PrivateKey privateKey = caClient.getPrivateKey();
>   List<X509Certificate> newCertList = caClient.getTrustChain(); {code}
> Loading the entire trust chain triggers a listCA call, which is a network call 
> to SCMSecurityProtocolServer:
> {code:java}
> public List<String> updateCAList() throws IOException {
>   pemEncodedCACertsLock.lock();
>   try {
>     pemEncodedCACerts = getScmSecureClient().listCACertificate(); {code}
> All of this happens inside the StorageContainerManager constructor, but the 
> SCM services are started only after the constructor has completed and 
> scm.start() is called. SCM therefore sends a request to the security server 
> before that server has started, which leads to connection refused messages 
> during SCM startup like the ones below:
> {code:java}
> 10:45:45.506 AM             INFO      SCMRatisServerImpl starting Raft server for scm:7b4b7153-eb02-443b-b8f9-3b146931674c
> 10:45:47.563 AM             INFO      RetryInvocationHandler com.google.protobuf.ServiceException: java.net.ConnectException: Call From <HOSTNAME>/<IP> to <HOSTNAME>:9961 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking $Proxy11.submitRequest over nodeId=node1,nodeAddress=<HOSTNAME>/<IP>:9961 after 1 failover attempts. Trying to failover after sleeping for 2000ms.
> 10:45:49.565 AM             INFO      RetryInvocationHandler com.google.protobuf.ServiceException: java.net.ConnectException: Call From <HOSTNAME>/<IP> to <HOSTNAME>:9961 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking $Proxy11.submitRequest over nodeId=node1,nodeAddress=<HOSTNAME>/<IP>:9961 after 2 failover attempts. Trying to failover after sleeping for 2000ms.
> (repeated) {code}
> Stack trace:
> {code:java}
> java.net.ConnectException: Connection refused
>     at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:205)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:730)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:843)
>     at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:430)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1681)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1506)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1459)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>     at com.sun.proxy.$Proxy14.submitRequest(Unknown Source)
>     at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
>     at com.sun.proxy.$Proxy14.submitRequest(Unknown Source)
>     at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.submitRequest(SCMSecurityProtocolClientSideTranslatorPB.java:102)
>     at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.listCACertificate(SCMSecurityProtocolClientSideTranslatorPB.java:374)
>     at org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.updateCAList(DefaultCertificateClient.java:933)
>     at org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.listCA(DefaultCertificateClient.java:921)
>     at org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.getTrustChain(DefaultCertificateClient.java:410)
>     at org.apache.hadoop.hdds.security.ssl.ReloadingX509KeyManager.loadKeyManager(ReloadingX509KeyManager.java:204)
>     at org.apache.hadoop.hdds.security.ssl.ReloadingX509KeyManager.<init>(ReloadingX509KeyManager.java:85)
>     at org.apache.hadoop.hdds.security.ssl.PemFileBasedKeyStoresFactory.createKeyManagers(PemFileBasedKeyStoresFactory.java:83)
>     at org.apache.hadoop.hdds.security.ssl.PemFileBasedKeyStoresFactory.init(PemFileBasedKeyStoresFactory.java:104)
>     at org.apache.hadoop.hdds.security.x509.keys.SecurityUtil.getServerKeyStoresFactory(SecurityUtil.java:103)
>     at org.apache.hadoop.hdds.security.x509.certificate.client.DefaultCertificateClient.getServerKeyStoresFactory(DefaultCertificateClient.java:948)
>     at org.apache.hadoop.hdds.scm.ha.HASecurityUtils.createSCMRatisTLSConfig(HASecurityUtils.java:345)
>     at org.apache.hadoop.hdds.scm.ha.SCMRatisServerImpl.<init>(SCMRatisServerImpl.java:109)
>     at org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl.<init>(SCMHAManagerImpl.java:97)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:646)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:400)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:597)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:609)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.start(StorageContainerManagerStarter.java:171)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.startScm(StorageContainerManagerStarter.java:145)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:74)
>     at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.call(StorageContainerManagerStarter.java:48)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:1953) {code}
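
The ordering problem described in the report can be illustrated with a small, self-contained sketch. Nothing here is Ozone code: FakeSecurityServer, EagerKeyManager and LazyKeyManager are hypothetical stand-ins, the certificate strings are placeholders, and the lazy variant is only one possible way to avoid the early call, not necessarily how this issue is or will be fixed. The fake server refuses calls until start() has run, which mimics a constructor-time request reaching a security server that is not yet listening.

```java
import java.net.ConnectException;
import java.util.List;

public class StartupOrderSketch {

  /** Hypothetical stand-in for the security server; refuses calls until start() runs. */
  static final class FakeSecurityServer {
    private boolean started;

    void start() {
      started = true;
    }

    List<String> listCACertificate() throws ConnectException {
      if (!started) {
        throw new ConnectException("Connection refused");
      }
      // Placeholder values, not real PEM data.
      return List.of("root-ca-pem", "sub-ca-pem");
    }
  }

  /** Eager variant: fetches the CA list inside its constructor, as in the report. */
  static final class EagerKeyManager {
    private final List<String> trustChain;

    EagerKeyManager(FakeSecurityServer server) throws ConnectException {
      this.trustChain = server.listCACertificate();
    }

    List<String> trustChain() {
      return trustChain;
    }
  }

  /** Lazy variant (an assumption, not the actual fix): fetches on first use. */
  static final class LazyKeyManager {
    private final FakeSecurityServer server;
    private List<String> trustChain;

    LazyKeyManager(FakeSecurityServer server) {
      this.server = server; // no remote call during construction
    }

    synchronized List<String> trustChain() throws ConnectException {
      if (trustChain == null) {
        trustChain = server.listCACertificate();
      }
      return trustChain;
    }
  }

  /** Returns true if building the eager variant before start() is refused. */
  public static boolean eagerConstructionFails() {
    FakeSecurityServer server = new FakeSecurityServer();
    try {
      new EagerKeyManager(server);
      return false;
    } catch (ConnectException e) {
      return true;
    }
  }

  /** Builds the lazy variant first, starts the server, then returns the chain size. */
  public static int lazyChainSizeAfterStart() {
    FakeSecurityServer server = new FakeSecurityServer();
    LazyKeyManager manager = new LazyKeyManager(server);
    server.start();
    try {
      return manager.trustChain().size();
    } catch (ConnectException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println("eager construction fails: " + eagerConstructionFails());
    System.out.println("lazy chain size after start: " + lazyChainSizeAfterStart());
  }
}
```

In this sketch the eager variant fails because its constructor runs before start(), while the lazy variant can be constructed at the same point and only contacts the server once it is up.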



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
