errose28 commented on code in PR #3778:
URL: https://github.com/apache/ozone/pull/3778#discussion_r998689250


##########
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyCommitRequest.java:
##########
@@ -190,6 +191,93 @@ public void testValidateAndUpdateCache() throws Exception {
         omKeyInfo.getLatestVersionLocations().getLocationList());
   }
 
+  @Test
+  public void testValidateAndUpdateCacheWithUncommittedBlocks()
+      throws Exception {
+
+    // allocated block list
+    List<KeyLocation> allocatedKeyLocationList = getKeyLocation(5, 0);
+
+    List<OmKeyLocationInfo> allocatedBlockList = allocatedKeyLocationList
+        .stream().map(OmKeyLocationInfo::getFromProtobuf)
+        .collect(Collectors.toList());
+
+    // committed block list, with three blocks different with the allocated
+    List<KeyLocation> committedKeyLocationList = getKeyLocation(5, 3);

Review Comment:
   It looks like the purpose of this new `getKeyLocation(int, int)` method is 
to create block lists with some overlap. However the client should never be 
able to commit blocks that were not already allocated in the open key. See 
#2108, which modified `OMKeyInfo#verifyAndGetKeyLocations` to drop the extra 
blocks and added a corresponding unit test in this class. Therefore in this 
test I think we only need to test the client committing a subset of the blocks 
it allocated. We could use the original `getKeyLocation(int)` method to make 
the allocated list, copy this, and just remove a block or two to create the 
commit list without needing a new method.



##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java:
##########
@@ -214,17 +220,29 @@ public void 
updateLocationInfoList(List<OmKeyLocationInfo> locationInfoList,
       updatedBlockLocations =
           verifyAndGetKeyLocations(locationInfoList, keyLocationInfoGroup);
     }
-    // Updates the latest locationList in the latest version only with
-    // given locationInfoList here.
-    // TODO : The original allocated list and the updated list here may vary
-    // as the containers on the Datanode on which the blocks were pre allocated
-    // might get closed. The diff of blocks between these two lists here
-    // need to be garbage collected in case the ozone client dies.
+
+    // Every time the key commits, the uncommitted blocks should be detected
+    // If client not commit, the blocks should be cleaned by Open Key CleanUp
+    // Service.
+    // keyLocationInfoGroup has the allocated block location info
+    List<OmKeyLocationInfo> uncommittedBlocks =
+        new ArrayList<>(keyLocationInfoGroup.getBlocksLatestVersionOnly());
+    // Only check ContainerBlockID here to avoid the mismatch of the pipeline
+    // field and BcsId in the OmKeyLocationInfo, as the OmKeyInfoCodec ignores
+    // the pipeline field by default and bcsId would be updated in Ratis mode.
+    Set<ContainerBlockID> updatedBlockIDs = updatedBlockLocations.stream().
+        map(BlockLocationInfo::getBlockID).map(BlockID::getContainerBlockID).
+        collect(Collectors.toSet());
+    uncommittedBlocks.removeIf(block -> updatedBlockIDs.
+        contains(block.getBlockID().getContainerBlockID()));
+

Review Comment:
   The allocated block list has already been iterated once in the call to 
`verifyAndGetKeyLocations` above. Can we refactor this so that the checks in 
that method and these new checks can be done in one iteration?
   
   In total we need to identify blocks that are in the allocated list but not 
in the commit list, and blocks that are in the commit list but not in the 
allocated list. We could
   1. Convert the allocated list to a set.
   2. Iterate the commit list.
       - if the current block in the commit list is not in the allocated set, 
drop it and log a warning like the code currently does.
       - else remove the current block in the commit list from the allocated 
set.
    3. The blocks remaining in the allocated set can be returned as the 
uncommitted blocks to be deleted.
    4. The blocks left in the commit list are ok to be committed.
   
   This code is on the main write path, so we should try to make it as 
efficient as possible.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to