smengcl commented on code in PR #5167:
URL: https://github.com/apache/ozone/pull/5167#discussion_r1290460459
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequest.java:
##########
@@ -242,7 +253,16 @@ public OMClientResponse
validateAndUpdateCache(OzoneManager ozoneManager,
if (null == oldKeyVersionsToDeleteMap) {
oldKeyVersionsToDeleteMap = new HashMap<>();
}
- oldKeyVersionsToDeleteMap.put(delKeyName, oldVerKeyInfo);
+
+ // Remove any block from oldVerKeyInfo that share the same container ID
+ // and local ID with omKeyInfo blocks'.
Review Comment:
Thanks @sumitagrawl . Similar idea also occurred to us but I wasn't
proficient in HSync block commit flow so @errose28 , @jojochuang and I decided
to take the generic filtering approach, for now.
1. I briefly tried out your patch on top of this PR branch. I get the idea.
However, if you comment out the call to the filtering logic:
https://github.com/apache/ozone/blob/5661ad98616245606baffd09c86472d6af9c2968/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequestWithFSO.java#L214
and run the new integration test, you'll see `OMKeyCommitRequestWithFSO` is
still putting the key block into `deletedTable`:
```
org.opentest4j.AssertionFailedError: deletedTable should not have such
entry. key = /volume13960/bucket38943/file-hsync-then-close/4611686018427388673
at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:39)
at org.junit.jupiter.api.Assertions.fail(Assertions.java:134)
at
org.apache.hadoop.fs.ozone.TestHSync.testKeyHSyncThenClose(TestHSync.java:184)
```
That is because when it is reading the `omKeyInfo` during the second
non-HSync commit, `omKeyInfo.getMetadata()` is empty. So
`isPreviousCommitHsync` is always `false` (at least in FSO).
In turn this is because `omKeyInfo` is retrieved from `OpenKeyTable`:
https://github.com/apache/ozone/blob/5661ad98616245606baffd09c86472d6af9c2968/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequestWithFSO.java#L153-L154
And in KeyCommitRequest, `omKeyInfo` with the HSYNC metadata is only updated
in `KeyTable` (`FileTable`), not `OpenKeyTable`:
https://github.com/apache/ozone/blob/5661ad98616245606baffd09c86472d6af9c2968/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequestWithFSO.java#L252-L253
Therefore, the .patch file by itself looks incomplete. Would you raise a
another PR later, once the current filtering patch is merged?
2. Another problem is, currently I see no check to ensure the same
`clientId` is doing the commit in HSync key commit flow. Therefore, if another
client with different ClientID (rogue or not) tries to commit this key without
HSync, `isPreviousCommitHsync` will be `false` according to your patch. Then
again it will fall into the second branch, which queues the block for deletion
and in turn cause data loss if the block filtering logic does not exist.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]