dimas-b commented on code in PR #4850:
URL: https://github.com/apache/polaris/pull/4850#discussion_r3461105951


##########
runtime/service/src/main/java/org/apache/polaris/service/task/BatchFileCleanupTaskHandler.java:
##########
@@ -77,17 +77,12 @@ public boolean handleTask(TaskEntity task, CallContext callContext) {
                 missingFiles.size());
       }
 
-      // Schedule the deletion for each file asynchronously
-      List<CompletableFuture<Void>> deleteFutures =
-          validFiles.stream()
-              .map(file -> super.tryDelete(tableId, authorizedFileIO, null, file, null, 1))
-              .toList();
+      CompletableFuture<Void> deleteFutures =
+          tryDelete(
+              tableId, authorizedFileIO, validFiles, cleanupTask.type().getValue(), true, null, 1);

Review Comment:
   My take is that the Iceberg Java library is biased toward execution in 
engine-like environments (which is fair).
   
   Polaris being a server should probably have full control over its thread 
pools.
   
   So, from my POV it would be preferable to copy the batch delete logic into 
Polaris code to allow it to fall back on its own thread pools in case batching 
is not available at the FileIO level.
   
   Speaking of the latter, can this manifest in practice in Polaris? Only S3, 
GCS and Azure are supported ATM.
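
To make the fallback idea above concrete, here is a minimal sketch, not the PR's code. `SimpleFileIO` and `BulkFileIO` are hypothetical stand-ins for Iceberg's `FileIO` (`deleteFile(String)`) and `SupportsBulkOperations` (`deleteFiles(Iterable<String>)`), and `BatchDeleteSketch.deleteAll` is an illustrative name, not the `tryDelete` method in the diff. The assumption is that the caller passes in an executor owned by the server, so no work lands on a library-managed pool.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical stand-in for Iceberg's FileIO (single-file delete only).
interface SimpleFileIO {
  void deleteFile(String path);
}

// Hypothetical stand-in for Iceberg's SupportsBulkOperations.
interface BulkFileIO extends SimpleFileIO {
  void deleteFiles(Iterable<String> paths);
}

final class BatchDeleteSketch {
  private BatchDeleteSketch() {}

  // Deletes all files, preferring one bulk call when the FileIO supports it.
  // Either way, the work runs on the executor supplied by the caller.
  static CompletableFuture<Void> deleteAll(
      SimpleFileIO io, List<String> files, Executor executor) {
    if (io instanceof BulkFileIO) {
      BulkFileIO bulk = (BulkFileIO) io;
      // One task on the caller's pool issues the whole batch.
      return CompletableFuture.runAsync(() -> bulk.deleteFiles(files), executor);
    }
    // Fallback: one task per file, still on the caller's pool.
    CompletableFuture<?>[] perFile =
        files.stream()
            .map(f -> CompletableFuture.runAsync(() -> io.deleteFile(f), executor))
            .toArray(CompletableFuture[]::new);
    return CompletableFuture.allOf(perFile);
  }
}
```

The returned future completes exceptionally if any delete throws, so retry and error reporting stay with the caller. Retries, batching limits and metrics are deliberately left out of this sketch.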



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
