danny0405 commented on code in PR #8717:
URL: https://github.com/apache/hudi/pull/8717#discussion_r1194622214


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackStrategy.java:
##########
@@ -194,19 +194,15 @@ private String getBaseFileExtension(HoodieTableMetaClient metaClient) {
   }
 
   @NotNull
-  private HoodieRollbackRequest getHoodieRollbackRequest(String partitionPath, FileStatus[] filesToDeletedStatus) {
-    List<String> filesToDelete = getFilesToBeDeleted(filesToDeletedStatus);
-    return new HoodieRollbackRequest(
-        partitionPath, EMPTY_STRING, EMPTY_STRING, filesToDelete, Collections.emptyMap());
-  }
-
-  @NotNull
-  private List<String> getFilesToBeDeleted(FileStatus[] dataFilesToDeletedStatus) {
-    return Arrays.stream(dataFilesToDeletedStatus).map(fileStatus -> {
-      String dataFileToBeDeleted = fileStatus.getPath().toString();
-      // strip scheme E.g: file:/var/folders
-      return dataFileToBeDeleted.substring(dataFileToBeDeleted.indexOf(":") + 1);
-    }).collect(Collectors.toList());
+  private List<HoodieRollbackRequest> getHoodieRollbackRequests(String partitionPath, FileStatus[] filesToDeletedStatus) {
+    return Arrays.stream(filesToDeletedStatus)
+        .map(fileStatus -> {
+          String dataFileToBeDeleted = fileStatus.getPath().toString();
+          // strip scheme E.g: file:/var/folders

Review Comment:
   So with this change the rollback parallelism is at file granularity
(currently the rollback is partition based), which means the max parallelism
could now be as high as the number of files to clean:
   
   ```java
   Math.max(Math.min(rollbackRequests.size(), config.getRollbackParallelism()), 1)
   ```
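
To make the effect concrete, here is a minimal standalone sketch of how that expression behaves once there is one request per file rather than one per partition. The class name `RollbackParallelismSketch` and the helper `effectiveParallelism` are hypothetical and not part of the PR; the plain `int` arguments stand in for `rollbackRequests.size()` and `config.getRollbackParallelism()`.

```java
public class RollbackParallelismSketch {

  // Hypothetical helper mirroring the expression above: the request count is
  // capped by the configured parallelism, and the result is never below 1.
  static int effectiveParallelism(int numRollbackRequests, int configuredParallelism) {
    return Math.max(Math.min(numRollbackRequests, configuredParallelism), 1);
  }

  public static void main(String[] args) {
    int configured = 100; // assumed value, for illustration only

    // No files to roll back: still at least one task.
    System.out.println(effectiveParallelism(0, configured));   // 1

    // Fewer requests than the configured limit: one task per request.
    System.out.println(effectiveParallelism(5, configured));   // 5

    // File-level requests can easily exceed the limit: capped by the config.
    System.out.println(effectiveParallelism(500, configured)); // 100
  }
}
```

With partition-level requests the first argument is the number of partitions; with file-level requests it grows to the number of files, so the configured rollback parallelism is more likely to become the effective bound.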


