bharatviswa504 opened a new pull request #2312:
URL: https://github.com/apache/ozone/pull/2312


   ## What changes were proposed in this pull request?
   
   On the SCM server side, if the exception is an SCMSecurityException with error code NOT_A_PRIMARY_SCM, return a RetriableWithFailOverException instead. This way, the FailOverProxyProvider fails over and retries the request against the next SCM.
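
A minimal sketch of this mapping and its effect on a client that fails over. It is illustrative only and is not the code in this patch: the nested classes are simplified stand-ins for the Ozone classes of the same name, and `checkException`, `getScmCertificate`, and `callWithFailover` are hypothetical helpers. It also assumes that scm1 is the primary, which the description does not state.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class FailoverMappingSketch {

  // Simplified stand-in for the error codes carried by SCMSecurityException.
  enum ErrorCode { NOT_A_PRIMARY_SCM, DEFAULT }

  // Simplified stand-in for the Ozone SCMSecurityException.
  static class SCMSecurityException extends IOException {
    private final ErrorCode errorCode;

    SCMSecurityException(String message, ErrorCode errorCode) {
      super(message);
      this.errorCode = errorCode;
    }

    ErrorCode getErrorCode() {
      return errorCode;
    }
  }

  // Simplified stand-in for the Ozone RetriableWithFailOverException.
  static class RetriableWithFailOverException extends IOException {
    RetriableWithFailOverException(IOException cause) {
      super(cause);
    }
  }

  // Server side (hypothetical helper): wrap only the NOT_A_PRIMARY_SCM case
  // so the client knows it should try another SCM.
  static IOException checkException(IOException e) {
    if (e instanceof SCMSecurityException
        && ((SCMSecurityException) e).getErrorCode()
            == ErrorCode.NOT_A_PRIMARY_SCM) {
      return new RetriableWithFailOverException(e);
    }
    return e;
  }

  // Hypothetical server call: only the primary SCM issues the certificate.
  static String getScmCertificate(String node, String primary)
      throws IOException {
    if (!node.equals(primary)) {
      throw checkException(new SCMSecurityException(
          "Get SCM Certificate can be run only primary SCM",
          ErrorCode.NOT_A_PRIMARY_SCM));
    }
    return "cert-from-" + node;
  }

  // Client side (hypothetical helper): try the nodes in configured order and
  // move to the next one whenever a RetriableWithFailOverException comes back.
  static String callWithFailover(List<String> nodes, String primary)
      throws IOException {
    IOException last = null;
    for (String node : nodes) {
      try {
        return getScmCertificate(node, primary);
      } catch (RetriableWithFailOverException e) {
        last = e;
      }
    }
    throw last != null ? last : new IOException("no SCM nodes configured");
  }

  public static void main(String[] args) throws IOException {
    // Same node order as in the manual test: scm2, scm3, scm1.
    List<String> nodes = Arrays.asList("scm2", "scm3", "scm1");
    System.out.println(callWithFailover(nodes, "scm1")); // prints "cert-from-scm1"
  }
}
```

Any other exception is returned unchanged, so only the "not a primary SCM" case triggers failover in this sketch.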
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5317
   
   ## How was this patch tested?
   
   Tested manually on docker-compose with the order of the node IDs changed to scm2,scm3,scm1. Then started SCM3, so that it connects to scm2 first, and checked whether it is able to bootstrap.
   
   SCM3 connected to SCM2, which threw a RetriableWithFailOverException:
   ```
   scm2.org_1   | org.apache.hadoop.hdds.scm.ha.RetriableWithFailOverException: org.apache.hadoop.hdds.security.exception.SCMSecurityException: Get SCM Certificate can be run only primary SCM
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.ha.RatisUtil.checkRatisException(RatisUtil.java:206)
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.protocol.SCMSecurityProtocolServerSideTranslatorPB.processRequest(SCMSecurityProtocolServerSideTranslatorPB.java:157)
   scm2.org_1   |       at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.protocol.SCMSecurityProtocolServerSideTranslatorPB.submitRequest(SCMSecurityProtocolServerSideTranslatorPB.java:97)
   scm2.org_1   |       at org.apache.hadoop.hdds.protocol.proto.SCMSecurityProtocolProtos$SCMSecurityProtocolService$2.callBlockingMethod(SCMSecurityProtocolProtos.java:15124)
   scm2.org_1   |       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
   scm2.org_1   |       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
   scm2.org_1   |       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
   scm2.org_1   |       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
   scm2.org_1   |       at java.base/java.security.AccessController.doPrivileged(Native Method)
   scm2.org_1   |       at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
   scm2.org_1   |       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
   scm2.org_1   |       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)
   scm2.org_1   | Caused by: org.apache.hadoop.hdds.security.exception.SCMSecurityException: Get SCM Certificate can be run only primary SCM
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.server.SCMSecurityProtocolServer.getSCMCertificate(SCMSecurityProtocolServer.java:200)
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.protocol.SCMSecurityProtocolServerSideTranslatorPB.getSCMCertificate(SCMSecurityProtocolServerSideTranslatorPB.java:228)
   scm2.org_1   |       at org.apache.hadoop.hdds.scm.protocol.SCMSecurityProtocolServerSideTranslatorPB.processRequest(SCMSecurityProtocolServerSideTranslatorPB.java:127)
   scm2.org_1   |       ... 11 more
   ```
   
   SCM3 bootstrap is successful.
   ```
   scm3.org_1   | 2021-06-08 08:11:53,076 [main] INFO server.StorageContainerManager: SCM BootStrap  is successful for ClusterID CID-74d4b242-a5d7-4b07-8677-f75f0207c0e8, SCMID d7a4c94b-423a-45ae-b04a-9474584206d1
   scm3.org_1   | 2021-06-08 08:11:53,076 [main] INFO server.StorageContainerManager: Primary SCM Node ID 4f54d4de-8942-47b0-a88e-99e5d1bbcad7
   scm3.org_1   | 2021-06-08 08:11:53,086 [shutdown-hook-0] INFO server.StorageContainerManagerStarter: SHUTDOWN_MSG:
   ```
   




