Xushaohong commented on code in PR #3778:
URL: https://github.com/apache/ozone/pull/3778#discussion_r1000291110
##########
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java:
##########
@@ -214,17 +220,29 @@ public void
updateLocationInfoList(List<OmKeyLocationInfo> locationInfoList,
updatedBlockLocations =
verifyAndGetKeyLocations(locationInfoList, keyLocationInfoGroup);
}
- // Updates the latest locationList in the latest version only with
- // given locationInfoList here.
- // TODO : The original allocated list and the updated list here may vary
- // as the containers on the Datanode on which the blocks were pre allocated
- // might get closed. The diff of blocks between these two lists here
- // need to be garbage collected in case the ozone client dies.
+
+ // Every time the key commits, the uncommitted blocks should be detected
+ // If client not commit, the blocks should be cleaned by Open Key CleanUp
+ // Service.
+ // keyLocationInfoGroup has the allocated block location info
+ List<OmKeyLocationInfo> uncommittedBlocks =
+ new ArrayList<>(keyLocationInfoGroup.getBlocksLatestVersionOnly());
+ // Only check ContainerBlockID here to avoid the mismatch of the pipeline
+ // field and BcsId in the OmKeyLocationInfo, as the OmKeyInfoCodec ignores
+ // the pipeline field by default and bcsId would be updated in Ratis mode.
+ Set<ContainerBlockID> updatedBlockIDs = updatedBlockLocations.stream().
+ map(BlockLocationInfo::getBlockID).map(BlockID::getContainerBlockID).
+ collect(Collectors.toSet());
+ uncommittedBlocks.removeIf(block -> updatedBlockIDs.
+ contains(block.getBlockID().getContainerBlockID()));
+
Review Comment:
The point that saving extra iteration and doing it in
**verifyAndGetKeyLocations** is a good idea!
We need to compare two lists, but the key to distinguishing is the
**ContainerBlockID** field.
So we should not simply convert the allocated list to a set to avoid
iterating the allocated list each time when checking if the
**ContainerBlockID** matches.
Now I use the HashMap to store the ContainerBlockID as the key and thus save
the time complexity. The cost is relatively small and removes the extra
iteration of allocated block.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]