[ https://issues.apache.org/jira/browse/HDDS-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759479#comment-17759479 ]

Sammi Chen commented on HDDS-4226:
----------------------------------

Hi [~georgeJahad], do you plan to work on this in the near future?

> Cleanup OM snapshots left after a failed installSnapshot
> --------------------------------------------------------
>
>                 Key: HDDS-4226
>                 URL: https://issues.apache.org/jira/browse/HDDS-4226
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager
>            Reporter: Mukul Kumar Singh
>            Assignee: George Jahad
>            Priority: Major
>              Labels: MiniOzoneChaosCluster
>
> Ozone Manager tries to install the snapshot:
> {code:java}
> 2020-09-09 22:07:14,830 [pool-144-thread-1] INFO  om.OzoneManager (OzoneManager.java:installCheckpoint(3159)) - Installing checkpoint with OMTransactionInfo 2#68754
> 2020-09-09 22:07:14,831 [grpc-default-executor-50] INFO  impl.RaftServerImpl (RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE: reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
> {code}
> The installation failed because of the issue reported in HDDS-4224.
> {code:java}
> 2020-09-09 22:07:14,831 [pool-144-thread-1] ERROR om.OzoneManager (OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install snapshot from Leader OM: {}
> java.lang.NullPointerException
>         at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3168)
>         at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
>         at org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
>         at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> The checkpoint from the failed installation is left behind in the snapshot directory.
> {code:java}
> ➜  chaos-2020-09-09-22-05-33-IST ls MiniOzoneClusterImpl-71baac34-2321-4756-ba1e-5834c5628047/omNode-2/ratis/snapshot/om.db-omNode-1-1599669
> om.db-omNode-1-1599669432684/  om.db-omNode-1-1599669451421/  
> om.db-omNode-1-1599669478149/  om.db-omNode-1-1599669504818/  
> om.db-omNode-1-1599669533577/  om.db-omNode-1-1599669566509/
> om.db-omNode-1-1599669433775/  om.db-omNode-1-1599669453030/  
> om.db-omNode-1-1599669480273/  om.db-omNode-1-1599669507385/  
> om.db-omNode-1-1599669535603/  om.db-omNode-1-1599669568325/
> om.db-omNode-1-1599669434867/  om.db-omNode-1-1599669454688/  
> om.db-omNode-1-1599669482206/  om.db-omNode-1-1599669509373/  
> om.db-omNode-1-1599669537716/  om.db-omNode-1-1599669570186/
> om.db-omNode-1-1599669435886/  om.db-omNode-1-1599669456346/  
> om.db-omNode-1-1599669484256/  om.db-omNode-1-1599669511241/  
> om.db-omNode-1-1599669540574/  om.db-omNode-1-1599669572150/
> om.db-omNode-1-1599669437199/  om.db-omNode-1-1599669458194/  
> om.db-omNode-1-1599669486200/  om.db-omNode-1-1599669513051/  
> om.db-omNode-1-1599669543136/  om.db-omNode-1-1599669574811/
> om.db-omNode-1-1599669438519/  om.db-omNode-1-1599669459992/  
> om.db-omNode-1-1599669487968/  om.db-omNode-1-1599669515343/  
> om.db-omNode-1-1599669546272/  om.db-omNode-1-1599669576833/
> om.db-omNode-1-1599669439819/  om.db-omNode-1-1599669461897/  
> om.db-omNode-1-1599669490218/  om.db-omNode-1-1599669517332/  
> om.db-omNode-1-1599669548363/  om.db-omNode-1-1599669578680/
> om.db-omNode-1-1599669441209/  om.db-omNode-1-1599669463871/  
> om.db-omNode-1-1599669492005/  om.db-omNode-1-1599669519320/  
> om.db-omNode-1-1599669551596/  om.db-omNode-1-1599669580427/
> om.db-omNode-1-1599669442606/  om.db-omNode-1-1599669465810/  
> om.db-omNode-1-1599669493727/  om.db-omNode-1-1599669521491/  
> om.db-omNode-1-1599669554153/  om.db-omNode-1-1599669582124/
> om.db-omNode-1-1599669443967/  om.db-omNode-1-1599669467909/  
> om.db-omNode-1-1599669495587/  om.db-omNode-1-1599669523436/  
> om.db-omNode-1-1599669556370/  om.db-omNode-1-1599669583768/
> om.db-omNode-1-1599669445468/  om.db-omNode-1-1599669470054/  
> om.db-omNode-1-1599669497445/  om.db-omNode-1-1599669525567/  
> om.db-omNode-1-1599669558461/  om.db-omNode-1-1599669585501/
> om.db-omNode-1-1599669446937/  om.db-omNode-1-1599669472125/  
> om.db-omNode-1-1599669499362/  om.db-omNode-1-1599669527648/  
> om.db-omNode-1-1599669560578/
> om.db-omNode-1-1599669448360/  om.db-omNode-1-1599669474051/  
> om.db-omNode-1-1599669501269/  om.db-omNode-1-1599669529648/  
> om.db-omNode-1-1599669562666/
> om.db-omNode-1-1599669449867/  om.db-omNode-1-1599669476078/  
> om.db-omNode-1-1599669503036/  om.db-omNode-1-1599669531573/  
> om.db-omNode-1-1599669564620/
> {code}
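
For reference, here is a minimal sketch of the kind of cleanup the issue title asks for. It is not the Ozone implementation: the class name StaleCheckpointCleaner and its purge method are hypothetical, and the sketch assumes that every directory in the snapshot directory whose name starts with the given prefix (for example "om.db-", as in the listing above) is a leftover that is safe to delete. A real fix would also have to make sure that no installSnapshot is in progress and that the checkpoint being installed is excluded.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not part of the Ozone code base. */
final class StaleCheckpointCleaner {

  private StaleCheckpointCleaner() { }

  /**
   * Deletes every directory directly under snapshotDir whose name starts
   * with prefix and returns the paths that were deleted.
   * Assumption: all matching directories are leftovers from failed installs.
   */
  static List<Path> purge(Path snapshotDir, String prefix) throws IOException {
    // Collect the candidates first so nothing is deleted while the
    // directory stream is still being iterated.
    List<Path> stale = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(snapshotDir)) {
      for (Path entry : entries) {
        boolean matches = entry.getFileName().toString().startsWith(prefix);
        if (matches && Files.isDirectory(entry)) {
          stale.add(entry);
        }
      }
    }
    for (Path dir : stale) {
      deleteRecursively(dir);
    }
    return stale;
  }

  /** Deletes root and everything below it. */
  private static void deleteRecursively(Path root) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order puts children before their parent directories.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

Such a helper could be called from the failure path of the install, or once at OM startup, with the ratis snapshot directory and the "om.db-" prefix. Matching on the prefix with a trailing dash leaves a directory named plain "om.db" untouched.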



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
