hemantk-12 opened a new pull request, #5262:
URL: https://github.com/apache/ozone/pull/5262

   ## What changes were proposed in this pull request?
   Currently, if snapshot chain reconstruction fails for any reason, OM restart 
fails and leaves the OM in an unstable state that cannot be recovered without 
manual effort. Previous issues 
([HDDS-7689](https://issues.apache.org/jira/browse/HDDS-7689), 
[HDDS-8530](https://issues.apache.org/jira/browse/HDDS-8530), 
[HDDS-8832](https://issues.apache.org/jira/browse/HDDS-8832) and 
[HDDS-9073](https://issues.apache.org/jira/browse/HDDS-9073)) are examples 
where OM restart failed and left the OM in an unstable state.
   
   This change proposes recording the snapshot chain state and letting the OM 
start even when snapshot chain reconstruction fails. While the chain is marked 
as corrupted, all snapshot write operations (create, delete and purge 
snapshot) are blocked.
   
   A new flag, `snapshotChainCorrupted`, is added to `SnapshotChainManager` to 
record whether the chain was loaded successfully. It is checked in the chain 
access methods, such as `addSnapshot()` and `deleteSnapshot()` and a few 
others, before a response is returned to the caller. If the chain is 
corrupted, `addSnapshot()` and `deleteSnapshot()` throw 
`IllegalStateException`, which fails the snapshot write operation.
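   
   The guard described above could look roughly like the following minimal 
sketch. It is not the code in this patch: `SnapshotChainSketch`, its `link()` 
and `validateChain()` helpers, the `UUID`-keyed map and the "previous snapshot 
must already exist" rule are all assumptions made for illustration. Only the 
`snapshotChainCorrupted` flag, the `addSnapshot()` / `deleteSnapshot()` method 
names and the `IllegalStateException` come from the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical, simplified stand-in for SnapshotChainManager (not the Ozone class). */
class SnapshotChainSketch {
  // snapshot ID -> previous snapshot ID (null for the first snapshot in the chain)
  private final Map<UUID, UUID> chain = new LinkedHashMap<>();
  private final boolean snapshotChainCorrupted;

  /** Rebuilds the chain from persisted entries; a failure is recorded, not rethrown. */
  SnapshotChainSketch(Map<UUID, UUID> persisted) {
    boolean corrupted = false;
    try {
      for (Map.Entry<UUID, UUID> entry : persisted.entrySet()) {
        link(entry.getKey(), entry.getValue());
      }
    } catch (IllegalStateException e) {
      // Keep the failure as state so that startup can continue.
      chain.clear();
      corrupted = true;
    }
    this.snapshotChainCorrupted = corrupted;
  }

  boolean isSnapshotChainCorrupted() {
    return snapshotChainCorrupted;
  }

  /** Write path: rejected while the chain is marked corrupted. */
  void addSnapshot(UUID id, UUID previous) {
    validateChain();
    link(id, previous);
  }

  /** Write path: rejected while the chain is marked corrupted. */
  boolean deleteSnapshot(UUID id) {
    validateChain();
    if (!chain.containsKey(id)) {
      return false;
    }
    chain.remove(id); // simplification: neighbours are not relinked here
    return true;
  }

  private void validateChain() {
    if (snapshotChainCorrupted) {
      throw new IllegalStateException("Snapshot chain is corrupted.");
    }
  }

  // Assumed consistency rule for this sketch: the previous snapshot must be known.
  private void link(UUID id, UUID previous) {
    if (previous != null && !chain.containsKey(previous)) {
      throw new IllegalStateException("Missing previous snapshot for " + id);
    }
    chain.put(id, previous);
  }

  public static void main(String[] args) {
    Map<UUID, UUID> persisted = new LinkedHashMap<>();
    persisted.put(UUID.randomUUID(), UUID.randomUUID()); // dangling previous ID
    SnapshotChainSketch manager = new SnapshotChainSketch(persisted);
    System.out.println("corrupted = " + manager.isSnapshotChainCorrupted());
    try {
      manager.addSnapshot(UUID.randomUUID(), null);
    } catch (IllegalStateException e) {
      System.out.println("write rejected: " + e.getMessage());
    }
  }
}
```

   In this sketch the constructor never throws on a bad chain, which mirrors 
the goal of letting the OM start, while every write path fails fast through a 
single check.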
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-9199
   
   ## How was this patch tested?
   Updated unit and integration tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

