rajarshisarkar commented on a change in pull request #4377:
URL: https://github.com/apache/iceberg/pull/4377#discussion_r834209172



##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java
##########
@@ -88,6 +88,15 @@
   String USE_STARTING_SEQUENCE_NUMBER = "use-starting-sequence-number";
   boolean USE_STARTING_SEQUENCE_NUMBER_DEFAULT = true;
 
+  /**
+   * Forces the compaction order according to the partition size in decreasing order, instead of

Review comment:
       I would let users control the comparator through two options:
   - If `rewrite-partitions-by-bytes` is enabled, we should sort the partitions by bytes (highest to lowest).
   - If `rewrite-partitions-by-num-files` is enabled, we should sort the partitions by the number of files they contain (highest to lowest).
   - If neither option is enabled, we should use the natural partition order.
   - Users cannot enable both options at the same time.
   
   In the above example:
   
   If `rewrite-partitions-by-bytes` is enabled, then file group 1 (100 GB + 2 GB = 102 GB) should go before file group 2 (50 GB).
   If `rewrite-partitions-by-num-files` is enabled, the file group with more files should go first.
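
   A rough sketch of the selection logic described above, to make the proposal concrete. This is only an illustration under assumptions: the `Group` class, the `order` helper, and the file counts are hypothetical and are not part of the Iceberg API, and the two option keys are just the names suggested in this comment.

   ```java
   import java.util.ArrayList;
   import java.util.Comparator;
   import java.util.List;
   import java.util.Map;

   class RewriteOrderSketch {
     // Option keys as proposed in this comment (assumed names, not existing Iceberg options).
     static final String BY_BYTES = "rewrite-partitions-by-bytes";
     static final String BY_NUM_FILES = "rewrite-partitions-by-num-files";

     // Hypothetical stand-in for a planned file group.
     static final class Group {
       final String name;
       final long sizeGb;
       final int numFiles;

       Group(String name, long sizeGb, int numFiles) {
         this.name = name;
         this.sizeGb = sizeGb;
         this.numFiles = numFiles;
       }
     }

     // Returns a new list ordered according to the enabled option.
     // With no option enabled, the incoming (natural) order is preserved.
     static List<Group> order(List<Group> groups, Map<String, String> options) {
       boolean byBytes = Boolean.parseBoolean(options.getOrDefault(BY_BYTES, "false"));
       boolean byNumFiles = Boolean.parseBoolean(options.getOrDefault(BY_NUM_FILES, "false"));

       // The two options are mutually exclusive.
       if (byBytes && byNumFiles) {
         throw new IllegalArgumentException(
             "Cannot enable both " + BY_BYTES + " and " + BY_NUM_FILES);
       }

       List<Group> ordered = new ArrayList<>(groups);
       if (byBytes) {
         // Largest group first.
         ordered.sort(Comparator.comparingLong((Group g) -> g.sizeGb).reversed());
       } else if (byNumFiles) {
         // Group with the most files first.
         ordered.sort(Comparator.comparingInt((Group g) -> g.numFiles).reversed());
       }
       return ordered;
     }
   }
   ```

   With file group 1 at 102 GB and file group 2 at 50 GB, enabling `rewrite-partitions-by-bytes` puts group 1 first. Enabling `rewrite-partitions-by-num-files` puts whichever group has more files first, and enabling both is rejected up front.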




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


