[
https://issues.apache.org/jira/browse/HDDS-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759479#comment-17759479
]
Sammi Chen commented on HDDS-4226:
----------------------------------
Hi [~georgeJahad], do you plan to work on this in the near future?
> Cleanup OM snapshots left after a failed installSnapshot
> --------------------------------------------------------
>
> Key: HDDS-4226
> URL: https://issues.apache.org/jira/browse/HDDS-4226
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Mukul Kumar Singh
> Assignee: George Jahad
> Priority: Major
> Labels: MiniOzoneChaosCluster
>
> OzoneManager tries to install the snapshot:
> {code:java}
> 2020-09-09 22:07:14,830 [pool-144-thread-1] INFO om.OzoneManager
> (OzoneManager.java:installCheckpoint(3159)) - Installing checkpoint with
> OMTransactionInfo 2#68754
> 2020-09-09 22:07:14,831 [grpc-default-executor-50] INFO impl.RaftServerImpl
> (RaftServerImpl.java:installSnapshot(1127)) - omNode-2@group-D62218D261DE:
> reply installSnapshot: omNode-1<-omNode-2#0:FAIL-t2,IN_PROGRESS
> {code}
> The install failed because of the issue described in HDDS-4224.
> {code:java}
> 2020-09-09 22:07:14,831 [pool-144-thread-1] ERROR om.OzoneManager
> (OzoneManager.java:installSnapshotFromLeader(3141)) - Failed to install
> snapshot from Leader OM: {}
> java.lang.NullPointerException
> at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3168)
> at org.apache.hadoop.ozone.om.OzoneManager.installCheckpoint(OzoneManager.java:3162)
> at org.apache.hadoop.ozone.om.OzoneManager.installSnapshotFromLeader(OzoneManager.java:3139)
> at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$notifyInstallSnapshotFromLeader$4(OzoneManagerStateMachine.java:372)
> at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
>
> The checkpoint is left behind in the snapshot directory after the failure:
> {code:java}
> ➜ chaos-2020-09-09-22-05-33-IST ls
> MiniOzoneClusterImpl-71baac34-2321-4756-ba1e-5834c5628047/omNode-2/ratis/snapshot/om.db-omNode-1-1599669
> om.db-omNode-1-1599669432684/ om.db-omNode-1-1599669451421/
> om.db-omNode-1-1599669478149/ om.db-omNode-1-1599669504818/
> om.db-omNode-1-1599669533577/ om.db-omNode-1-1599669566509/
> om.db-omNode-1-1599669433775/ om.db-omNode-1-1599669453030/
> om.db-omNode-1-1599669480273/ om.db-omNode-1-1599669507385/
> om.db-omNode-1-1599669535603/ om.db-omNode-1-1599669568325/
> om.db-omNode-1-1599669434867/ om.db-omNode-1-1599669454688/
> om.db-omNode-1-1599669482206/ om.db-omNode-1-1599669509373/
> om.db-omNode-1-1599669537716/ om.db-omNode-1-1599669570186/
> om.db-omNode-1-1599669435886/ om.db-omNode-1-1599669456346/
> om.db-omNode-1-1599669484256/ om.db-omNode-1-1599669511241/
> om.db-omNode-1-1599669540574/ om.db-omNode-1-1599669572150/
> om.db-omNode-1-1599669437199/ om.db-omNode-1-1599669458194/
> om.db-omNode-1-1599669486200/ om.db-omNode-1-1599669513051/
> om.db-omNode-1-1599669543136/ om.db-omNode-1-1599669574811/
> om.db-omNode-1-1599669438519/ om.db-omNode-1-1599669459992/
> om.db-omNode-1-1599669487968/ om.db-omNode-1-1599669515343/
> om.db-omNode-1-1599669546272/ om.db-omNode-1-1599669576833/
> om.db-omNode-1-1599669439819/ om.db-omNode-1-1599669461897/
> om.db-omNode-1-1599669490218/ om.db-omNode-1-1599669517332/
> om.db-omNode-1-1599669548363/ om.db-omNode-1-1599669578680/
> om.db-omNode-1-1599669441209/ om.db-omNode-1-1599669463871/
> om.db-omNode-1-1599669492005/ om.db-omNode-1-1599669519320/
> om.db-omNode-1-1599669551596/ om.db-omNode-1-1599669580427/
> om.db-omNode-1-1599669442606/ om.db-omNode-1-1599669465810/
> om.db-omNode-1-1599669493727/ om.db-omNode-1-1599669521491/
> om.db-omNode-1-1599669554153/ om.db-omNode-1-1599669582124/
> om.db-omNode-1-1599669443967/ om.db-omNode-1-1599669467909/
> om.db-omNode-1-1599669495587/ om.db-omNode-1-1599669523436/
> om.db-omNode-1-1599669556370/ om.db-omNode-1-1599669583768/
> om.db-omNode-1-1599669445468/ om.db-omNode-1-1599669470054/
> om.db-omNode-1-1599669497445/ om.db-omNode-1-1599669525567/
> om.db-omNode-1-1599669558461/ om.db-omNode-1-1599669585501/
> om.db-omNode-1-1599669446937/ om.db-omNode-1-1599669472125/
> om.db-omNode-1-1599669499362/ om.db-omNode-1-1599669527648/
> om.db-omNode-1-1599669560578/
> om.db-omNode-1-1599669448360/ om.db-omNode-1-1599669474051/
> om.db-omNode-1-1599669501269/ om.db-omNode-1-1599669529648/
> om.db-omNode-1-1599669562666/
> om.db-omNode-1-1599669449867/ om.db-omNode-1-1599669476078/
> om.db-omNode-1-1599669503036/ om.db-omNode-1-1599669531573/
> om.db-omNode-1-1599669564620/ {code}
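
For illustration only, the cleanup this issue asks for could look roughly like the sketch below: after a failed install, remove every leftover checkpoint directory whose name starts with a given prefix (here `om.db-`, as in the listing above). The class `CheckpointCleanup` and both method names are hypothetical and are not part of the Ozone code base; where such a call would be hooked into `installSnapshotFromLeader` is left open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper; not an existing Ozone class. */
public class CheckpointCleanup {

  /** Deletes a directory tree, visiting children before their parents. */
  static void deleteRecursively(Path root) throws IOException {
    try (Stream<Path> walk = Files.walk(root)) {
      // Reverse order puts the deepest paths first, so directories are empty when deleted.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    } catch (UncheckedIOException e) {
      throw e.getCause();
    }
  }

  /**
   * Removes every directory directly under snapshotDir whose name starts with
   * prefix. Returns the number of directories removed.
   */
  static int cleanupStaleCheckpoints(Path snapshotDir, String prefix) throws IOException {
    // Collect matches first so nothing is deleted while the listing is still open.
    List<Path> stale = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(snapshotDir, prefix + "*")) {
      for (Path entry : entries) {
        if (Files.isDirectory(entry)) {
          stale.add(entry);
        }
      }
    }
    for (Path dir : stale) {
      deleteRecursively(dir);
    }
    return stale.size();
  }
}
```

A real fix would presumably delete only the checkpoint belonging to the failed attempt (for example in a `finally` block around the install), rather than sweeping the whole directory, so that a concurrently downloaded checkpoint is not removed.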
--
This message was sent by Atlassian Jira
(v8.20.10#820010)