xloya commented on code in PR #4427:
URL: https://github.com/apache/iceberg/pull/4427#discussion_r843412283


##########
core/src/main/java/org/apache/iceberg/util/Tasks.java:
##########
@@ -212,7 +212,9 @@ public boolean run(Task<I, RuntimeException> task) {
           I item = iterator.next();
           try {
             runTaskWithRetry(task, item);
-            succeeded.add(item);
+            if (revertTask != null) {
+              succeeded.add(item);
+            }

Review Comment:
   Let me explain why the change is needed here:  
   At present, when rewriting, it will be rewritten with the `RewriteFileGroup` 
as the rewrite unit through the `Tasks` tool class without `revertWith` method: 
 
   
(https://github.com/apache/iceberg/blob/56d8f07ce6cf7ac99f439848ebf7c8b86f5046df/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/actions/BaseRewriteDataFilesSparkAction.java#L307-L313)
   But at this time, after the rewrite is finished, there is no need to revert 
the successful task. As you can see from the following code, unless we call and 
set `revertTask`, the logic here will not be executed anyway:  
   
(https://github.com/apache/iceberg/blob/56d8f07ce6cf7ac99f439848ebf7c8b86f5046df/core/src/main/java/org/apache/iceberg/util/Tasks.java#L126-L129)
  
   
https://github.com/apache/iceberg/blob/56d8f07ce6cf7ac99f439848ebf7c8b86f5046df/core/src/main/java/org/apache/iceberg/util/Tasks.java#L356-L380
   But the current logic is that no matter what, the reference of the 
successful task will be saved to the concurrent queue of succeeded, and the 
`RewriteFileGroup` that has been rewritten will also be saved. If we want to 
release the memory of `RewriteFileGroup`, then we should also remove 
unnecessary references to it here  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to