Swaminathan Balachandran created HDDS-11452:
-----------------------------------------------

             Summary: OmSnapshotPurgeRequest is not atomic and can lead to 
SnapshotChain Corruption
                 Key: HDDS-11452
                 URL: https://issues.apache.org/jira/browse/HDDS-11452
             Project: Apache Ozone
          Issue Type: Sub-task
            Reporter: Swaminathan Balachandran
            Assignee: Swaminathan Balachandran


OmSnapshotPurgeRequest updates the snapshot chain and also updates the 
snapshot info table cache. If a checked exception is thrown partway through, 
the request swallows the exception and returns an error response. The problem 
is that by then the snapshot info table cache has been partially updated: it 
is no longer coherent with the snapshot chain, and the partial update won't be 
flushed to disk. On restart this could lead to snapshot chain and snapshot 
info corruption. 

The proposal here is to make the entire request atomic:

1) Update the snapshot chain & maintain the updated snapshot infos in local 
uncommitted space.

2) In case of an exception, roll back by putting every snapshot that was 
removed back into the snapshot chain, and return an error response.

3) If no exception is thrown, update the snapshot info table cache.

4) Send the response to the double buffer.
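
The four steps above can be sketched roughly as follows. This is only an 
illustration of the stage / roll back / commit pattern, not the actual Ozone 
code: the class AtomicPurgeSketch, the list-based chain, and the map-based 
cache (snapshot id -> previous snapshot id) are hypothetical stand-ins for the 
real snapshot chain manager and snapshot info table cache, and the double 
buffer hand-off is only indicated by a comment.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AtomicPurgeSketch {
  // Hypothetical chain: snapshot ids ordered oldest to newest.
  final List<String> chain = new ArrayList<>();
  // Hypothetical info cache: snapshot id -> previous snapshot id ("" if none).
  final Map<String, String> infoCache = new HashMap<>();

  // Remembers where a removed snapshot sat so it can be put back.
  private static final class Removed {
    final int index;
    final String id;
    Removed(int index, String id) { this.index = index; this.id = id; }
  }

  void addSnapshot(String id) {
    String prev = chain.isEmpty() ? "" : chain.get(chain.size() - 1);
    chain.add(id);
    infoCache.put(id, prev);
  }

  /** Returns true if the purge was fully applied, false if it was rolled back. */
  boolean purge(List<String> ids) {
    // Step 1: local uncommitted space. A null value marks a deleted entry.
    Map<String, String> staged = new HashMap<>();
    Deque<Removed> removed = new ArrayDeque<>();
    try {
      for (String id : ids) {
        int idx = chain.indexOf(id);
        if (idx < 0) {
          throw new IOException("Snapshot not in chain: " + id);
        }
        chain.remove(idx);
        removed.push(new Removed(idx, id));
        staged.put(id, null);
        if (idx < chain.size()) {
          // The successor now points at the purged snapshot's predecessor.
          String prev = idx == 0 ? "" : chain.get(idx - 1);
          staged.put(chain.get(idx), prev);
        }
      }
    } catch (IOException e) {
      // Step 2: undo removals in reverse order; the cache was never touched.
      while (!removed.isEmpty()) {
        Removed r = removed.pop();
        chain.add(r.index, r.id);
      }
      return false; // the caller would build the error response here
    }
    // Step 3: no exception, so publish the staged infos to the cache.
    for (Map.Entry<String, String> e : staged.entrySet()) {
      if (e.getValue() == null) {
        infoCache.remove(e.getKey());
      } else {
        infoCache.put(e.getKey(), e.getValue());
      }
    }
    // Step 4: the caller would now hand the response to the double buffer.
    return true;
  }
}
```

The point of the sketch is that the cache is written only after every chain 
update has succeeded, so a failure midway leaves both the chain and the cache 
exactly as they were before the request.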

cc: [~hemantk] [~ppogde] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
