singhpk234 commented on a change in pull request #3676:
URL: https://github.com/apache/iceberg/pull/3676#discussion_r826185680



##########
File path: core/src/main/java/org/apache/iceberg/actions/BinPackStrategy.java
##########
@@ -155,9 +155,12 @@ public RewriteStrategy options(Map<String, String> options) {
   public Iterable<List<FileScanTask>> planFileGroups(Iterable<FileScanTask> dataFiles) {
     ListPacker<FileScanTask> packer = new BinPacking.ListPacker<>(maxGroupSize, 1, false);
     List<List<FileScanTask>> potentialGroups = packer.pack(dataFiles, FileScanTask::length);
+
     return potentialGroups.stream().filter(group ->
-      group.size() >= minInputFiles || sizeOfInputFiles(group) > targetFileSize ||
-              group.stream().anyMatch(this::taskHasTooManyDeletes)
+            (group.size() >= minInputFiles && group.size() > 1) ||
+                sizeOfInputFiles(group) > targetFileSize ||
+                group.stream().anyMatch(this::taskHasTooManyDeletes) ||
+                (group.stream().anyMatch(this::taskHasDeletes) && group.size() == 1)

Review comment:
       I was trying to handle the case described in this [comment](https://github.com/apache/iceberg/issues/3236#issuecomment-989504872):
   > A group with a single file that is modified by deletes should be rewritten (if the rewrite deletes filter is on)
   
   Previously, if minInputFiles was 1 and group.size() == 1, we would have planned the group even if it didn't meet the deleteFileThreshold (checked in taskHasTooManyDeletes). I read the line quoted above as saying that a group with a single file affected by deletes should always be planned, so I skipped the delete threshold check for that case.
   Am I missing something here?
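   
   To make the difference concrete, here is a minimal standalone sketch of the old and new predicates. It is not the actual Iceberg code: `Task`, `DELETE_FILE_THRESHOLD`, and the helper names are hypothetical stand-ins for `FileScanTask`, `deleteFileThreshold`, and the strategy's private methods, and the `>=` comparison in `hasTooManyDeletes` is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class GroupFilterSketch {
  // Hypothetical stand-in for FileScanTask: only what the predicate needs.
  static final class Task {
    final long length;
    final int deleteFiles;

    Task(long length, int deleteFiles) {
      this.length = length;
      this.deleteFiles = deleteFiles;
    }
  }

  // Assumed value, for illustration only.
  static final int DELETE_FILE_THRESHOLD = 3;

  static boolean hasDeletes(Task t) {
    return t.deleteFiles > 0;
  }

  // Assumption: "too many" means at or above the threshold.
  static boolean hasTooManyDeletes(Task t) {
    return t.deleteFiles >= DELETE_FILE_THRESHOLD;
  }

  static long sizeOf(List<Task> group) {
    return group.stream().mapToLong(t -> t.length).sum();
  }

  // Predicate before the change: a single-file group passes whenever minInputFiles <= 1.
  static boolean oldFilter(List<Task> group, int minInputFiles, long targetFileSize) {
    return group.size() >= minInputFiles
        || sizeOf(group) > targetFileSize
        || group.stream().anyMatch(GroupFilterSketch::hasTooManyDeletes);
  }

  // Predicate after the change: a single-file group passes only on size or deletes.
  static boolean newFilter(List<Task> group, int minInputFiles, long targetFileSize) {
    return (group.size() >= minInputFiles && group.size() > 1)
        || sizeOf(group) > targetFileSize
        || group.stream().anyMatch(GroupFilterSketch::hasTooManyDeletes)
        || (group.stream().anyMatch(GroupFilterSketch::hasDeletes) && group.size() == 1);
  }

  public static void main(String[] args) {
    long target = 512L * 1024 * 1024;
    List<Task> oneDelete = Arrays.asList(new Task(1024, 1)); // below the threshold
    List<Task> noDeletes = Arrays.asList(new Task(1024, 0));

    System.out.println("minInputFiles=1, one delete: old=" + oldFilter(oneDelete, 1, target)
        + " new=" + newFilter(oneDelete, 1, target));
    System.out.println("minInputFiles=5, one delete: old=" + oldFilter(oneDelete, 5, target)
        + " new=" + newFilter(oneDelete, 5, target));
    System.out.println("minInputFiles=1, no deletes: old=" + oldFilter(noDeletes, 1, target)
        + " new=" + newFilter(noDeletes, 1, target));
  }
}
```

   Under these assumptions, the new predicate plans a single-file group with any deletes regardless of minInputFiles, and no longer plans a single-file group without deletes just because minInputFiles is 1.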
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


