[ 
https://issues.apache.org/jira/browse/HDDS-9156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760724#comment-17760724
 ] 

Sammi Chen commented on HDDS-9156:
----------------------------------

Another failure stack. These are the SCM4 log messages:

{code:java}
2023-08-30 17:44:15,206 [main] INFO ha.HASecurityUtils: Creating csr for SCM->hostName:scm4.org,scmId:e71b89a6-12de-4f0e-9f26-bab161fdde45,clusterId:CID-e9d51554-3438-41e1-b439-014b2ea5912e,subject:[email protected]
2023-08-30 17:44:16,646 [main] ERROR ha.HASecurityUtils: Error while fetching/storing SCM signed certificate.
org.apache.hadoop.hdds.security.exception.SCMSecurityException: The operation getSCMCertificate is prohibited due to root CA and sub CA rotation have just finished. The prohibition state will last at most PT10S. Please try the operation later again.
        at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.handleError(SCMSecurityProtocolClientSideTranslatorPB.java:122)
        at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.submitRequest(SCMSecurityProtocolClientSideTranslatorPB.java:104)
        at org.apache.hadoop.hdds.protocolPB.SCMSecurityProtocolClientSideTranslatorPB.getSCMCertChain(SCMSecurityProtocolClientSideTranslatorPB.java:233)
        at org.apache.hadoop.hdds.scm.ha.HASecurityUtils.getRootCASignedSCMCert(HASecurityUtils.java:164)
        at org.apache.hadoop.hdds.scm.ha.HASecurityUtils.initializeSecurity(HASecurityUtils.java:115)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManager.scmBootstrap(StorageContainerManager.java:1187)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter$SCMStarterHelper.bootStrap(StorageContainerManagerStarter.java:192)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.bootStrapScm(StorageContainerManagerStarter.java:135)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1972)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
        at picocli.CommandLine.execute(CommandLine.java:2078)
        at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
        at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
        at org.apache.hadoop.hdds.scm.server.StorageContainerManagerStarter.main(StorageContainerManagerStarter.java:63)
org.apache.hadoop.hdds.security.exception.SCMSecurityException: The operation getSCMCertificate is prohibited due to root CA and sub CA rotation have just finished. The prohibition state will last at most PT10S. Please try the operation later again.
2023-08-30 17:44:16,665 [shutdown-hook-0] INFO server.StorageContainerManagerStarter: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down StorageContainerManager at scm4.org/172.25.0.120
************************************************************/
{code}
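
The exception above says the prohibition lasts at most PT10S and asks the caller to try again later. As a minimal sketch only, and not the fix adopted for this issue, a caller could read that ISO-8601 duration out of the message and wait before retrying. The class name {{RetryAfterProhibition}} and both methods are hypothetical and are not part of Ozone; the only library calls used are the standard {{java.time.Duration.parse}} and {{java.util.regex}} APIs.

```java
import java.time.Duration;
import java.time.format.DateTimeParseException;
import java.util.concurrent.Callable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ozone class: waits out the prohibition window
// announced in the error text, then retries the operation.
final class RetryAfterProhibition {

  // Matches the duration in "... will last at most PT10S. Please try ...".
  private static final Pattern WINDOW =
      Pattern.compile("at most (PT(?:\\d+[HMS])+)");

  private RetryAfterProhibition() { }

  // Returns the duration found in the message, or the fallback if the
  // message has no parsable duration.
  static Duration parseProhibitionWindow(String message, Duration fallback) {
    if (message == null) {
      return fallback;
    }
    Matcher m = WINDOW.matcher(message);
    if (!m.find()) {
      return fallback;
    }
    try {
      return Duration.parse(m.group(1));
    } catch (DateTimeParseException e) {
      return fallback;
    }
  }

  // Calls op up to maxAttempts times. Only failures whose message contains
  // "is prohibited" are retried (an assumption based on the log text above);
  // anything else is rethrown immediately.
  static <T> T callWithRetry(Callable<T> op, int maxAttempts, Duration fallback)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        String msg = e.getMessage();
        boolean retryable = msg != null && msg.contains("is prohibited");
        if (!retryable || attempt == maxAttempts) {
          throw e;
        }
        Thread.sleep(parseProhibitionWindow(msg, fallback).toMillis());
      }
    }
    throw new IllegalStateException("unreachable");
  }
}
```

With the message logged above, {{parseProhibitionWindow}} yields a 10-second duration, so a single retry would wait out the whole window. Matching on the message text is fragile; it is used here only because the log shows nothing else to key on.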

And these are the logs from the CI run:


{code:java}
Creating ozonesecure-ha_scm4.org_1 ... 
Creating ozonesecure-ha_scm4.org_1 ... done
Port 9894 is not available on scm4.org yet
Port 9894 is not available on scm4.org yet
Port 9894 is not available on scm4.org yet
Port 9894 is not available on scm4.org yet
Port 9894 is not available on scm4.org yet
Port 9894 is not available on scm4.org yet
Ncat: Could not resolve hostname "scm4.org": Name or service not known. QUITTING.
Port 9894 is not available on scm4.org yet
Ncat: Could not resolve hostname "scm4.org": Name or service not known. QUITTING.
Port 9894 is not available on scm4.org yet
Ncat: Could not resolve hostname "scm4.org": Name or service not known. QUITTING.
Port 9894 is not available on scm4.org yet
{code}



> cert-rotation acceptance test is failing randomly
> -------------------------------------------------
>
>                 Key: HDDS-9156
>                 URL: https://issues.apache.org/jira/browse/HDDS-9156
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Certificates, test
>            Reporter: Siyao Meng
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>
> -Despite there being no "FAIL"ed test items:-
> https://github.com/apache/ozone/actions/runs/5825414635/job/15797755458?pr=5172#logs
> ah. {{ozone admin scm roles}} command timed out:
> {code}
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> Timed out waiting on ozone admin scm roles | grep scm4.org to be successful
> {code}
> cc [~pifta]
> Another failure:
> https://github.com/apache/ozone/actions/runs/5827776199/job/15804891746
> {code}
> Creating ozonesecure-ha_scm4.org_1 ... done
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is not available on scm4.org yet
> Port 9894 is available on scm4.org
> ==============================================================================
> Kinit :: Kinit test user                                                      
> ==============================================================================
> Kinit                                                                 | PASS |
> ------------------------------------------------------------------------------
> Kinit :: Kinit test user                                              | PASS |
> 1 test, 1 passed, 0 failed
> ==============================================================================
> Output:  /tmp/smoketest/ozonesecure-ha/result/robot-5.xml
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> ozone admin scm roles | grep scm4.org hasn't succeed yet
> Timed out waiting on ozone admin scm roles | grep scm4.org to be successful
> Stopping ozonesecure-ha_scm4.org_1  ... 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
