[ 
https://issues.apache.org/jira/browse/HDDS-7985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690082#comment-17690082
 ] 

Neil Joshi commented on HDDS-7985:
----------------------------------

cc [~nanda] 

> [SCM HA] On SCM Disk failure recovery causes Datanode Failure on startup 
> -------------------------------------------------------------------------
>
>                 Key: HDDS-7985
>                 URL: https://issues.apache.org/jira/browse/HDDS-7985
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Neil Joshi
>            Priority: Major
>
> Recovering from an SCM disk failure when no backup is available requires:
>  * Cleaning the _ozone.scm.db.dirs_ and _ozone.metadata.dirs_ locations
>  * Bootstrapping the SCM
>
> Whether or not the recovered SCM is the primordial node, an error occurs when
> a datanode is started after the SCM has been recovered.
>  
> Datanodes brought up after the SCM disk failure recovery fail to start with 
> a CA certificate error stating that the number of certificates received 
> from the SCM is greater than the number expected:
> {code:java}
> ozonesecure-ha-datanode1-1  | 2023-02-17 00:46:40 INFO  HAUtils:457 - Expected CA list size 4, where as received CA List size 5.
> {code}
> In this case, listing the certificates stored by the SCM shows a total of 
> 5 SCM certificates after SCM2 recovers from the disk failure:
>  
> {code:java}
> [email protected]
> [email protected]
> [email protected]
> [email protected]
> [email protected]
>  
> {code}
> The list appears to contain 2 entries for SCM2 (the node recovered from the disk failure).
>  
> bash-4.2$ ozone admin cert list
> {code:java}
> Total 12 valid certificates:
> SerialNumber      Valid From                     Expiry                         Subject
> 1                 Fri Feb 17 00:00:00 UTC 2023   Mon Mar 27 00:00:00 UTC 2028   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, [email protected]
> 10760186198072    Fri Feb 17 00:00:00 UTC 2023   Mon Mar 27 00:00:00 UTC 2028   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, [email protected]
> 10779888473070    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=recon@recon
> 10780166036417    Fri Feb 17 00:00:00 UTC 2023   Mon Mar 27 00:00:00 UTC 2028   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f99f1a81-7cce-44c9-a09b-9f7bbc48b6ac, [email protected]
> 10788394717480    Fri Feb 17 00:00:00 UTC 2023   Mon Mar 27 00:00:00 UTC 2028   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=598be6bc-7d86-4cab-84dc-668a162a7ec2, [email protected]
> 10800769855768    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=dn@bd3138308a3f
> 10801305457014    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=dn@e4795cc77124
> 10801871334038    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=dn@3eb28ff965a1
> 10803980992569    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=om2
> 10804543987939    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=om3
> 10806118720884    Fri Feb 17 00:00:00 UTC 2023   Sat Feb 17 00:00:00 UTC 2024   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=f02b032a-7da0-4132-8a31-61c3d078e6cb, CN=om1
> 10932809284268    Fri Feb 17 00:00:00 UTC 2023   Mon Mar 27 00:00:00 UTC 2028   O=CID-abb46225-77ba-4132-ac6e-96792b40450c, OU=b4a175f3-c6a4-47fd-bcc5-c081b03de8c7, [email protected]
> {code}
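
One way to spot this kind of duplicate in a certificate listing is to group the SCM certificates by subject and flag any subject that holds more than one valid entry. The sketch below is illustrative only and is not Ozone code: the helper `find_duplicate_subjects` is hypothetical, and the serial numbers and host names are placeholders, since the CN values are not readable in the listing quoted above. Only the expected count of 4 and the received count of 5 come from the datanode log line.

```python
from collections import defaultdict


def find_duplicate_subjects(certs):
    """Return {subject: [serials]} for every subject with more than one certificate."""
    by_subject = defaultdict(list)
    for serial, subject in certs:
        by_subject[subject].append(serial)
    return {subject: sorted(serials)
            for subject, serials in by_subject.items()
            if len(serials) > 1}


# Placeholder data (assumption): one root CA plus one sub-CA per SCM,
# with a second certificate issued to the recovered SCM2.
scm_certs = [
    (1, "scm-root@scm1.example"),
    (101, "scm-sub@scm1.example"),
    (102, "scm-sub@scm2.example"),
    (103, "scm-sub@scm3.example"),
    (104, "scm-sub@scm2.example"),  # assumed to be issued after the bootstrap
]

expected_ca_count = 4  # "Expected CA list size 4" in the datanode log
duplicates = find_duplicate_subjects(scm_certs)
surplus = len(scm_certs) - expected_ca_count

print("duplicate subjects:", duplicates)
print("surplus CA certificates:", surplus)
```

With this placeholder data, the surplus of one certificate lines up with the single subject that appears twice, which is the pattern the listing above suggests for SCM2.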



--
This message was sent by Atlassian Jira
(v8.20.10#820010)