[ 
https://issues.apache.org/jira/browse/HDDS-10626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDDS-10626.
------------------------------------
    Fix Version/s: HDDS-7593
       Resolution: Fixed

> [LeaseRecovery] OM shuts down with "SecretKey client must have been 
> initialized already"
> ----------------------------------------------------------------------------------------
>
>                 Key: HDDS-10626
>                 URL: https://issues.apache.org/jira/browse/HDDS-10626
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: OM
>            Reporter: Pratyush Bhatt
>            Assignee: Sammi Chen
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: HDDS-7593
>
>
> While I was running lease recovery on multiple files during a rolling 
> restart, an Ozone Manager (OM) terminated abruptly after the OMs were 
> restarted. 
> {code:java}
> 2024-03-31 09:47:01,866 ERROR [om72-OMStateMachineApplyTransactionThread - 0]-org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine: Terminating with exit status 1: Request cmdType: RecoverLease
> traceID: ""
> clientId: "client-433C04E5C8CC"
> userInfo {
>   userName: "hdfs@XYZ"
>   remoteAddress: "xx.yy.ww.zz"
>   hostName: "vb1307.xyz.com"
> }
> version: 3
> layoutVersion {
>   version: 6
> }
> RecoverLeaseRequest {
>   volumeName: "hsyncvol"
>   bucketName: "hsyncbuck"
>   keyName: "hsync/File_24.txt"
>   force: false
> }
>  failed with exception
> java.lang.NullPointerException: SecretKey client must have been initialized already.
>         at java.util.Objects.requireNonNull(Objects.java:228)
>         at org.apache.hadoop.hdds.security.symmetric.DefaultSecretKeySignerClient.getCurrentSecretKey(DefaultSecretKeySignerClient.java:70)
>         at org.apache.hadoop.hdds.security.token.ShortLivedTokenSecretManager.createPassword(ShortLivedTokenSecretManager.java:47)
>         at org.apache.hadoop.hdds.security.token.OzoneBlockTokenSecretManager.generateToken(OzoneBlockTokenSecretManager.java:70)
>         at org.apache.hadoop.ozone.om.request.file.OMRecoverLeaseRequest.updateBlockInfo(OMRecoverLeaseRequest.java:281)
>         at org.apache.hadoop.ozone.om.request.file.OMRecoverLeaseRequest.doWork(OMRecoverLeaseRequest.java:264)
>         at org.apache.hadoop.ozone.om.request.file.OMRecoverLeaseRequest.validateAndUpdateCache(OMRecoverLeaseRequest.java:156)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.lambda$0(OzoneManagerRequestHandler.java:406)
>         at org.apache.hadoop.util.MetricUtil.captureLatencyNs(MetricUtil.java:45)
>         at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequestImpl(OzoneManagerRequestHandler.java:404)
>         at org.apache.hadoop.ozone.protocolPB.RequestHandler.handleWriteRequest(RequestHandler.java:63)
>         at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:525)
>         at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:343)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> I have seen this 2-3 times. This time I was able to reproduce it while lease 
> recovery was running during the rolling restart (RR) phase.
> cc: [~ashishk] [~weichiu] 
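
The top two frames of the trace show `Objects.requireNonNull` throwing inside `getCurrentSecretKey`, which suggests the RecoverLease request was applied before the secret key client had a key available. The following is only a minimal sketch of that failure mode, not the Ozone implementation: the class `SignerClientSketch`, its `cache` field, and its `start` method are hypothetical names chosen for illustration. Only the exception message is taken from the log above.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a secret key client; this is not the Ozone class.
final class SignerClientSketch {
    // Stays null until start() has stored a key.
    private final AtomicReference<String> cache = new AtomicReference<>();

    // Simulates the initialization step that populates the cached key.
    void start(String key) {
        cache.set(Objects.requireNonNull(key, "key"));
    }

    // Throws NullPointerException with the logged message if start() has not run yet.
    String getCurrentSecretKey() {
        return Objects.requireNonNull(cache.get(),
            "SecretKey client must have been initialized already.");
    }

    public static void main(String[] args) {
        SignerClientSketch client = new SignerClientSketch();
        try {
            // A request that needs a token arrives before initialization.
            client.getCurrentSecretKey();
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
        // Once the key is in place, the same call succeeds.
        client.start("key-1");
        System.out.println("after start: " + client.getCurrentSecretKey());
    }
}
```

In the sketch the exception is caught by the caller. In the report, the same kind of exception reached the state machine apply thread and the OM terminated with exit status 1.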



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
