hemantk-12 opened a new pull request, #5262: URL: https://github.com/apache/ozone/pull/5262
## What changes were proposed in this pull request?

Currently, an OM restart fails and leaves the OM in an unstable state, which cannot be recovered without manual effort, if there is any kind of failure in snapshot chain reconstruction. Previous issues ([HDDS-7689](https://issues.apache.org/jira/browse/HDDS-7689), [HDDS-8530](https://issues.apache.org/jira/browse/HDDS-8530), [HDDS-8832](https://issues.apache.org/jira/browse/HDDS-8832) and [HDDS-9073](https://issues.apache.org/jira/browse/HDDS-9073)) are cases where the OM restart failed and left the OM in an unstable state.

This change proposes recording the snapshot chain state and letting the OM start regardless of any snapshot or chain related issues. If the snapshot chain is corrupted, the recorded state is used to block all snapshot write operations, such as create, delete and purge snapshot.

A new field, `snapshotChainCorrupted`, is added to `SnapshotChainManager` to record whether the chain was loaded successfully. It is checked in the chain access methods, including `addSnapshot()`, `deleteSnapshot()` and a few others, before a response is returned to the caller. If the chain is corrupted, `addSnapshot()` and `deleteSnapshot()` throw `IllegalStateException`, which fails the snapshot write operation.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9199

## How was this patch tested?

Updated unit and integration tests.
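
To illustrate the guard described in the proposal, here is a minimal sketch rather than the code in this PR. `SketchSnapshotChainManager`, its constructor argument, `validateChain()` and `isSnapshotChainCorrupted()` are hypothetical names chosen for the example; only `snapshotChainCorrupted`, `addSnapshot()`, `deleteSnapshot()` and the use of `IllegalStateException` come from the description. The sketch assumes the persisted chain is given as a map from snapshot ID to previous snapshot ID in creation order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-in for SnapshotChainManager (not the Ozone class).
class SketchSnapshotChainManager {
  // Maps each snapshot ID to the ID of its predecessor (null for the first snapshot).
  private final Map<String, String> previousById = new LinkedHashMap<>();
  private String latestSnapshotId;
  // True if the chain could not be reconstructed at startup.
  private final boolean snapshotChainCorrupted;

  // Assumed input: snapshot ID -> previous snapshot ID, in creation order.
  SketchSnapshotChainManager(Map<String, String> persisted) {
    boolean corrupted = false;
    try {
      for (Map.Entry<String, String> e : persisted.entrySet()) {
        // Each snapshot must point at the snapshot loaded just before it.
        if (!Objects.equals(e.getValue(), latestSnapshotId)) {
          throw new IllegalStateException("Broken link at snapshot " + e.getKey());
        }
        previousById.put(e.getKey(), e.getValue());
        latestSnapshotId = e.getKey();
      }
    } catch (IllegalStateException ex) {
      // Record the failure instead of propagating it, so startup can continue.
      corrupted = true;
    }
    this.snapshotChainCorrupted = corrupted;
  }

  boolean isSnapshotChainCorrupted() {
    return snapshotChainCorrupted;
  }

  // Guard shared by every write path.
  private void validateChain() {
    if (snapshotChainCorrupted) {
      throw new IllegalStateException("Snapshot chain is corrupted.");
    }
  }

  void addSnapshot(String snapshotId) {
    validateChain();
    previousById.put(snapshotId, latestSnapshotId);
    latestSnapshotId = snapshotId;
  }

  void deleteSnapshot(String snapshotId) {
    validateChain();
    if (!previousById.containsKey(snapshotId)) {
      throw new IllegalArgumentException("Unknown snapshot " + snapshotId);
    }
    String previous = previousById.remove(snapshotId);
    // Relink the successor (if any) to the deleted snapshot's predecessor.
    for (Map.Entry<String, String> e : previousById.entrySet()) {
      if (snapshotId.equals(e.getValue())) {
        e.setValue(previous);
      }
    }
    if (snapshotId.equals(latestSnapshotId)) {
      latestSnapshotId = previous;
    }
  }

  String getLatestSnapshotId() {
    return latestSnapshotId;
  }
}
```

In this sketch, constructing the manager from an inconsistent map does not throw. The failure only surfaces later, when a caller attempts `addSnapshot()` or `deleteSnapshot()`, which mirrors the intent of letting the OM start while blocking snapshot writes.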
