smengcl opened a new pull request, #4701:
URL: https://github.com/apache/ozone/pull/4701

   ## What changes were proposed in this pull request?
   
   During snapshot creation, access to `deletedTable` and 
`deletedDirectoryTable` must be synchronized with `KeyDeletingTask` 
and `DirDeletingTask`. Otherwise, out-of-order reads and writes could corrupt 
either table.
   
   The following code flows show why the locks are needed:
   
   ### `createOmSnapshotCheckpoint` flow, called from 
`OMSnapshotCreateResponse#addToDBBatch`
   
   1. Acquire `getTableLock(deletedDirectoryTable)` write lock, then 
acquire `getTableLock(deletedTable)` write lock
   2. In `deletedTable`, remove all keys with prefix matching snapshot scope 
path (bucket)
   3. In `deletedDirectoryTable`, remove all keys with prefix matching snapshot 
scope path (bucket)
   4. Release `getTableLock(deletedTable)` write lock, then 
release `getTableLock(deletedDirectoryTable)` write lock
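
A minimal sketch of the lock nesting in the steps above, not the actual Ozone code. The class name, the two static `ReentrantReadWriteLock` fields (stand-ins for what `getTableLock(...)` returns) and the in-memory `TreeMap` tables are all assumptions made for illustration.

```java
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the snapshot checkpoint flow; not Ozone's real classes.
class SnapshotCheckpointLockSketch {
  // Assumed stand-ins for getTableLock(deletedDirectoryTable) and getTableLock(deletedTable).
  static final ReentrantReadWriteLock DELETED_DIR_TABLE_LOCK = new ReentrantReadWriteLock();
  static final ReentrantReadWriteLock DELETED_TABLE_LOCK = new ReentrantReadWriteLock();

  // In-memory maps standing in for the two RocksDB-backed tables.
  static final TreeMap<String, String> DELETED_TABLE = new TreeMap<>();
  static final TreeMap<String, String> DELETED_DIR_TABLE = new TreeMap<>();

  // Removes every entry whose key starts with the snapshot scope prefix.
  static void removeByPrefix(TreeMap<String, String> table, String prefix) {
    table.keySet().removeIf(key -> key.startsWith(prefix));
  }

  static void createCheckpoint(String bucketPrefix) {
    // Step 1: directory table lock first, then deleted table lock.
    DELETED_DIR_TABLE_LOCK.writeLock().lock();
    try {
      DELETED_TABLE_LOCK.writeLock().lock();
      try {
        removeByPrefix(DELETED_TABLE, bucketPrefix);      // step 2
        removeByPrefix(DELETED_DIR_TABLE, bucketPrefix);  // step 3
      } finally {
        // Step 4: release in the reverse order of acquisition.
        DELETED_TABLE_LOCK.writeLock().unlock();
      }
    } finally {
      DELETED_DIR_TABLE_LOCK.writeLock().unlock();
    }
  }
}
```

The nested `try`/`finally` blocks ensure both locks are released even if a table cleanup step throws.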
   
   ### `KeyDeletingTask#call` flow
   
   1. Acquire `getTableLock(deletedTable)` write lock
   2. `getPendingDeletionKeys()`: (currently) retrieves a number of keys from 
active DB's `deletedTable`
   3. `processKeyDeletes()`: deletes key blocks with the SCM client's 
`deleteKeyBlocks()`, then submits a `PurgeKeysRequest` Ratis request, which 
removes the successfully reclaimed keys from the active DB's `deletedTable`
   4. Release `getTableLock(deletedTable)` write lock
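
The key deleting flow above could be sketched roughly as follows. This is an illustration under assumptions, not the real `KeyDeletingTask`: the class, the `runOnce` method and its `keyLimit` parameter are hypothetical, the table is an in-memory map, and the SCM call and Ratis request are replaced by a direct removal that treats every block delete as successful.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for one pass of the key deleting task.
class KeyDeletingTaskSketch {
  // Assumed stand-in for getTableLock(deletedTable).
  static final ReentrantReadWriteLock DELETED_TABLE_LOCK = new ReentrantReadWriteLock();
  static final TreeMap<String, String> DELETED_TABLE = new TreeMap<>();

  // Runs one pass and returns the keys that were purged.
  static List<String> runOnce(int keyLimit) {
    DELETED_TABLE_LOCK.writeLock().lock();  // step 1
    try {
      // Step 2: read a bounded batch of pending keys.
      List<String> pending = new ArrayList<>();
      for (String key : DELETED_TABLE.keySet()) {
        if (pending.size() >= keyLimit) {
          break;
        }
        pending.add(key);
      }
      // Step 3: assume every block delete succeeded, then purge the keys.
      // The real flow goes through the SCM client and a Ratis request instead.
      for (String key : pending) {
        DELETED_TABLE.remove(key);
      }
      return pending;
    } finally {
      DELETED_TABLE_LOCK.writeLock().unlock();  // step 4
    }
  }
}
```

Holding the write lock across both the read and the purge means a snapshot checkpoint cannot clear `deletedTable` entries between the two steps.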
   
   ### `DirDeletingTask#call` flow
   
   1. Acquire `getTableLock(deletedDirectoryTable)` write lock
   2. Iterate over the active DB's `deletedDirectoryTable` and prepare a list of 
`PurgePathRequest`s, each containing the immediate children (keys and dirs) 
under one directory.
   3. Acquire `getTableLock(deletedTable)` write lock
   4. `optimizeDirDeletesAndSubmitRequest()`: recurse further into sub-dirs if 
the batch limit `pathLimitPerTask` isn't reached. Q: Can we refactor the same 
dir expansion logic, which appears in three places? 
[One](https://github.com/apache/ozone/blob/dd003040a41def491e8de003ef8539ce40854972/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/AbstractKeyDeletingService.java#L356-L380), 
[Two](https://github.com/apache/ozone/blob/fb15c0514252518dcd445936813d1f7ab21b8bc9/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/DirectoryDeletingService.java#L136-L158), 
[Three](https://github.com/apache/ozone/blob/4578a063533bc1396a218a69613a842ff0b32ec6/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/SnapshotDeletingService.java#L352-L375) 
@aswinshakil
   5. Submit `PurgePathRequest`s to Ratis
   6. Release `getTableLock(deletedTable)` write lock
   7. Release `getTableLock(deletedDirectoryTable)` write lock
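
Both `DirDeletingTask` and the snapshot checkpoint flow take the `deletedDirectoryTable` lock before the `deletedTable` lock, so two threads that each need both locks cannot deadlock by waiting on each other in a cycle. The sketch below illustrates that ordering with two threads. All names in it are hypothetical, and the table work is reduced to a counter increment.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo of a shared lock acquisition order; not Ozone's real classes.
class LockOrderSketch {
  static final ReentrantReadWriteLock DELETED_DIR_TABLE_LOCK = new ReentrantReadWriteLock();
  static final ReentrantReadWriteLock DELETED_TABLE_LOCK = new ReentrantReadWriteLock();
  static final AtomicInteger COMPLETED = new AtomicInteger();

  // Mirrors the DirDeletingTask steps: dir table lock, then deleted table lock.
  static void dirDeletingTask() {
    DELETED_DIR_TABLE_LOCK.writeLock().lock();        // step 1
    try {
      // Step 2 would build the purge request list here (omitted).
      DELETED_TABLE_LOCK.writeLock().lock();          // step 3
      try {
        COMPLETED.incrementAndGet();                  // steps 4-5 (omitted work)
      } finally {
        DELETED_TABLE_LOCK.writeLock().unlock();      // step 6
      }
    } finally {
      DELETED_DIR_TABLE_LOCK.writeLock().unlock();    // step 7
    }
  }

  // Mirrors the snapshot checkpoint flow, which uses the same order.
  static void snapshotCheckpoint() {
    DELETED_DIR_TABLE_LOCK.writeLock().lock();
    try {
      DELETED_TABLE_LOCK.writeLock().lock();
      try {
        COMPLETED.incrementAndGet();
      } finally {
        DELETED_TABLE_LOCK.writeLock().unlock();
      }
    } finally {
      DELETED_DIR_TABLE_LOCK.writeLock().unlock();
    }
  }

  // Runs both flows concurrently and returns how many passes completed.
  static int runConcurrently(int rounds) {
    COMPLETED.set(0);
    Thread dirThread = new Thread(() -> {
      for (int i = 0; i < rounds; i++) {
        dirDeletingTask();
      }
    });
    Thread snapshotThread = new Thread(() -> {
      for (int i = 0; i < rounds; i++) {
        snapshotCheckpoint();
      }
    });
    dirThread.start();
    snapshotThread.start();
    try {
      dirThread.join();
      snapshotThread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return COMPLETED.get();
  }
}
```

If one flow took the locks in the opposite order, each thread could end up holding one lock while waiting for the other.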
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-8067
   
   ## How was this patch tested?
   
   - All existing tests should pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

