nsivabalan commented on code in PR #18016:
URL: https://github.com/apache/hudi/pull/18016#discussion_r2748068032


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieCommitMetadata.java:
##########
@@ -490,6 +491,18 @@ public HashSet<String> getWritePartitionPaths() {
     return new HashSet<>(partitionToWriteStats.keySet());
   }
 
+  public Set<String> getWritePartitionPathsWithExistingFileGroupsModified() {

Review Comment:
   How about `getWritePartitionPathsWithUpdatedFileGroups`?



##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieReplaceCommitMetadata.java:
##########
@@ -82,6 +85,12 @@ public String toJsonString() throws IOException {
     return JsonUtils.getObjectMapper().writerWithDefaultPrettyPrinter().writeValueAsString(this);
   }
 
+  @Override
+  public Set<String> getWritePartitionPathsWithExistingFileGroupsModified() {
+    return Stream.concat(super.getWritePartitionPathsWithExistingFileGroupsModified().stream(),

Review Comment:
   For commit metadata, I understand it's a little tricky, i.e. we need to 
differentiate new file groups from new file slices for an existing file group. 
   But for a replace commit, every partition involved should be included, right?
   
   clustering -> every partition in the replace commit metadata will have 
replaced file IDs. 
   insert_overwrite_table and insert_overwrite -> the same holds here.
   delete_partition -> same here as well. 
   
   So why can't we keep it simple for replace commit metadata?
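   
   A minimal sketch of the simpler approach suggested above, assuming every 
partition named in a replace commit counts as modified. The class, the method 
`partitionsTouched`, and the two map parameters are hypothetical stand-ins 
(plain string file IDs keyed by partition path), not the actual Hudi types:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; not the real HoodieReplaceCommitMetadata.
class ReplaceCommitPartitionsSketch {

  // Assumption: for a replace commit, any partition that appears either in the
  // written-files map or in the replaced-file-IDs map touches existing file groups,
  // so the result is simply the union of both key sets.
  static Set<String> partitionsTouched(Map<String, List<String>> partitionToWrittenFileIds,
                                       Map<String, List<String>> partitionToReplacedFileIds) {
    Set<String> partitions = new TreeSet<>(partitionToWrittenFileIds.keySet());
    partitions.addAll(partitionToReplacedFileIds.keySet());
    return partitions;
  }

  public static void main(String[] args) {
    // Clustering-like case: one partition has both new and replaced file groups.
    Map<String, List<String>> written = new HashMap<>();
    written.put("2024/01/01", Arrays.asList("fg-3"));

    Map<String, List<String>> replaced = new HashMap<>();
    replaced.put("2024/01/01", Arrays.asList("fg-1", "fg-2"));
    // delete_partition-like case: replaced file groups only, nothing written.
    replaced.put("2024/01/02", Arrays.asList("fg-9"));

    System.out.println(partitionsTouched(written, replaced));
  }
}
```

   With this shape there is no per-file-group filtering for replace commits; 
the new-file-group vs. new-file-slice distinction would only be needed for 
regular commit metadata.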
   
   


