TheR1sing3un commented on code in PR #12344:
URL: https://github.com/apache/hudi/pull/12344#discussion_r1889678771


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bucket/ConsistentBucketIndexUtils.java:
##########
@@ -117,55 +116,74 @@ public static Option<HoodieConsistentHashingMetadata> loadMetadata(HoodieTable t
         return filename.contains(HASHING_METADATA_FILE_SUFFIX);
       };
       final List<StoragePathInfo> metaFiles = metaClient.getStorage().listDirectEntries(metadataPath);
-      final TreeSet<String> commitMetaTss = metaFiles.stream().filter(hashingMetaCommitFilePredicate)
-          .map(commitFile -> HoodieConsistentHashingMetadata.getTimestampFromFile(commitFile.getPath().getName()))
-          .sorted()
-          .collect(Collectors.toCollection(TreeSet::new));
-      final List<StoragePathInfo> hashingMetaFiles = metaFiles.stream().filter(hashingMetadataFilePredicate)
-          .sorted(Comparator.comparing(f -> f.getPath().getName()))
+
+      final TreeMap<String/*instantTime*/, Pair<StoragePathInfo/*hash metadata file path*/, Boolean/*commited*/>> versionedHashMetadataFiles = metaFiles.stream()
+          .filter(hashingMetadataFilePredicate)
+          .map(metaFile -> {
+            String instantTime = HoodieConsistentHashingMetadata.getTimestampFromFile(metaFile.getPath().getName());
+            return Pair.of(instantTime, Pair.of(metaFile, false));
+          })
+          .sorted(Collections.reverseOrder())
+          .collect(Collectors.toMap(Pair::getLeft, Pair::getRight, (a, b) -> a, TreeMap::new));
+
+      metaFiles.stream().filter(hashingMetaCommitFilePredicate)
+          .forEach(commitFile -> {
+            String instantTime = HoodieConsistentHashingMetadata.getTimestampFromFile(commitFile.getPath().getName());
+            if (!versionedHashMetadataFiles.containsKey(instantTime)) {

Review Comment:
   > Let's try to fix the issues one by one
   
   As you said, now that the PR fixing the wrong-path problem has been merged, I have opened a new PR that refactors consistent-hash-metadata management for better code readability and maintainability. Let's continue the review at:
   https://github.com/apache/hudi/pull/12512



