[ https://issues.apache.org/jira/browse/HDDS-8530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720064#comment-17720064 ]
Hemant Kumar commented on HDDS-8530:
------------------------------------
There is one more problem with the current creationTime implementation: when
the client doesn't pass a snapshot name, snapshotName would not be consistent
across OM nodes, because it would be based on the currentTime of each OM node.
https://github.com/apache/ozone/blob/master/hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/SnapshotInfo.java#L486
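
To illustrate the concern, here is a minimal sketch, not the actual SnapshotInfo code. It assumes the default name is derived from a timestamp, and it contrasts reading the local clock on each node with deriving the name from a creation time that is stamped once and then replicated. The class name, method names, and name format are all hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class SnapshotNameSketch {
    // Hypothetical name format; the real format in SnapshotInfo may differ.
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss.SSS").withZone(ZoneOffset.UTC);

    /** Derives the default name purely from the supplied timestamp. */
    static String defaultName(long creationTimeMillis) {
        return "s" + FMT.format(Instant.ofEpochMilli(creationTimeMillis));
    }

    /** Problematic variant: every node consults its own clock. */
    static String defaultNameFromLocalClock() {
        return defaultName(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        // Assumption: the creation time is stamped once and carried in the
        // replicated request, so every node sees the same value.
        long stamped = 1683140388889L;
        String onNodeA = defaultName(stamped);
        String onNodeB = defaultName(stamped);
        System.out.println(onNodeA.equals(onNodeB)); // prints "true"
    }
}
```

Because defaultName depends only on its argument, any node that applies the same request arrives at the same name, whereas defaultNameFromLocalClock can differ from node to node.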
> [snapshot] OM crash on restart due to Snapshot Chain corruption
> ---------------------------------------------------------------
>
> Key: HDDS-8530
> URL: https://issues.apache.org/jira/browse/HDDS-8530
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Jyotirmoy Sinha
> Priority: Major
> Labels: ozone-snapshot
> Attachments: Screenshot 2023-05-05 at 5.20.22 PM.png, Screenshot
> 2023-05-05 at 5.51.26 PM.png
>
>
> snapshotTable is sorted lexicographically, so the assumption that the
> previous snapshot always exists is wrong.
> OM error stack trace:
> {code:java}
> 2023-05-03 18:59:48,889 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032458'
> 2023-05-03 18:59:48,889 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032459'
> 2023-05-03 18:59:48,890 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032466'
> 2023-05-03 18:59:48,890 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032475'
> 2023-05-03 18:59:48,890 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032480'
> 2023-05-03 18:59:48,890 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032482'
> 2023-05-03 18:59:48,891 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032491'
> 2023-05-03 18:59:48,891 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032495'
> 2023-05-03 18:59:48,891 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032497'
> 2023-05-03 18:59:48,891 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032501'
> 2023-05-03 18:59:48,891 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032506'
> 2023-05-03 18:59:48,892 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032517'
> 2023-05-03 18:59:48,892 [main] INFO org.apache.ozone.rocksdiff.RocksDBCheckpointDiffer: Can't find SST '032525'
> 2023-05-03 18:59:49,270 [main] ERROR org.apache.hadoop.ozone.om.OzoneManagerStarter: OM start failed with exception
> java.io.IOException: Snapshot Chain corruption: previous snapshotID given but no associated snapshot found in snapshot chain: SnapshotID 9384de9d-3e6e-4f18-b4dd-64e69a58f31e
>     at org.apache.hadoop.ozone.om.SnapshotChainManager.addSnapshotGlobal(SnapshotChainManager.java:86)
>     at org.apache.hadoop.ozone.om.SnapshotChainManager.addSnapshot(SnapshotChainManager.java:289)
>     at org.apache.hadoop.ozone.om.SnapshotChainManager.loadFromSnapshotInfoTable(SnapshotChainManager.java:279)
>     at org.apache.hadoop.ozone.om.SnapshotChainManager.<init>(SnapshotChainManager.java:63)
>     at org.apache.hadoop.ozone.om.OmMetadataManagerImpl.start(OmMetadataManagerImpl.java:517)
>     at org.apache.hadoop.ozone.om.OmMetadataManagerImpl.<init>(OmMetadataManagerImpl.java:321)
>     at org.apache.hadoop.ozone.om.OzoneManager.instantiateServices(OzoneManager.java:762)
>     at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:642)
>     at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:727)
>     at org.apache.hadoop.ozone.om.OzoneManagerStarter$OMStarterHelper.start(OzoneManagerStarter.java:189)
>     at org.apache.hadoop.ozone.om.OzoneManagerStarter.startOm(OzoneManagerStarter.java:86)
>     at org.apache.hadoop.ozone.om.OzoneManagerStarter.call(OzoneManagerStarter.java:74)
>     at org.apache.hadoop.hdds.cli.GenericCli.call(GenericCli.java:38)
>     at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
>     at picocli.CommandLine.access$1300(CommandLine.java:145)
>     at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
>     at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
>     at picocli.CommandLine.execute(CommandLine.java:2078)
>     at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:100)
>     at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:91)
>     at org.apache.hadoop.ozone.om.OzoneManagerStarter.main(OzoneManagerStarter.java:58)
> 2023-05-03 18:59:49,273 [shutdown-hook-0] INFO org.apache.hadoop.ozone.om.OzoneManagerStarter: SHUTDOWN_MSG:
> {code}
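
The ordering problem in the issue description can be reproduced in a small sketch. This is a simplified illustration, not the SnapshotChainManager code. It assumes a table that iterates in lexicographic key order and a loader that requires each snapshot's predecessor to be linked already. The Snap class, its fields, and both method names are hypothetical, and ordering by creation time is only one possible way to avoid the failure, not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SnapshotChainLoadSketch {
    /** Hypothetical, simplified stand-in for a snapshotTable value. */
    static final class Snap {
        final String id;
        final String previousId; // null for the first snapshot in the chain
        final long creationTime;

        Snap(String id, String previousId, long creationTime) {
            this.id = id;
            this.previousId = previousId;
            this.creationTime = creationTime;
        }
    }

    /** Links snapshots in iteration order; fails if a predecessor is not linked yet. */
    static List<String> buildChain(Iterable<Snap> snaps) {
        Set<String> seen = new HashSet<>();
        List<String> chain = new ArrayList<>();
        for (Snap s : snaps) {
            if (s.previousId != null && !seen.contains(s.previousId)) {
                throw new IllegalStateException(
                    "previous snapshot not found in chain: " + s.previousId);
            }
            seen.add(s.id);
            chain.add(s.id);
        }
        return chain;
    }

    /** Same linking logic, but entries are ordered by creation time first. */
    static List<String> buildChainByCreationTime(Iterable<Snap> snaps) {
        List<Snap> ordered = new ArrayList<>();
        for (Snap s : snaps) {
            ordered.add(s);
        }
        ordered.sort(Comparator.comparingLong((Snap s) -> s.creationTime));
        return buildChain(ordered);
    }

    public static void main(String[] args) {
        // A TreeMap iterates keys lexicographically, like a sorted table.
        Map<String, Snap> table = new TreeMap<>();
        table.put("/vol/bucket/zeta", new Snap("id-1", null, 100L));    // created first
        table.put("/vol/bucket/alpha", new Snap("id-2", "id-1", 200L)); // created second

        try {
            buildChain(table.values()); // "alpha" is visited before "zeta"
        } catch (IllegalStateException e) {
            System.out.println("load failed: " + e.getMessage());
        }
        System.out.println(buildChainByCreationTime(table.values())); // prints "[id-1, id-2]"
    }
}
```

In this sketch the snapshot named "alpha" sorts before "zeta" even though it was created later, so the key-order load encounters a previous ID that has not been linked yet. Ordering by creation time still assumes the timestamps themselves are trustworthy and distinct.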