rajarshisarkar commented on code in PR #4377:
URL: https://github.com/apache/iceberg/pull/4377#discussion_r844893011
##########
spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1038,6 +1046,157 @@ public void testInvalidAPIUsage() {
"Cannot set strategy", () ->
actions().rewriteDataFiles(table).sort(SortOrder.unsorted()).binPack());
}
+ @Test
+ public void testRewriteJobOrderBytesAsc() {
+ Table table = createTablePartitioned(4, 2);
+ shouldHaveFiles(table, 8);
+ List<Object[]> expectedRecords = currentData();
+ table.updateProperties().set(TableProperties.FORMAT_VERSION, "2").commit();
+ table.refresh();
+
+ BaseRewriteDataFilesSparkAction rewrite =
+ (org.apache.iceberg.spark.actions.BaseRewriteDataFilesSparkAction)
+ basicRewrite(table)
+ .option(RewriteDataFiles.REWRITE_JOB_ORDER, RewriteJobOrder.BYTES_ASC.orderName())
+ .binPack();
+ Map<StructLike, List<List<FileScanTask>>> fileGroupsByPartition =
+ rewrite.planFileGroups(table.currentSnapshot().snapshotId());
+ Result result = rewrite.execute();
+
+ Assert.assertEquals("Action should rewrite 8 data files", 8, result.rewrittenDataFilesCount());
+ Assert.assertEquals("Action should add 4 data file", 4, result.addedDataFilesCount());
+ shouldHaveFiles(table, 4);
+ List<Object[]> actualRecords = currentData();
+ assertEquals("Rows must match", expectedRecords, actualRecords);
+
+ Stream<RewriteFileGroup> groupStream = rewrite.toGroupStream(
Review Comment:
Thanks for the feedback. I have dropped the previous part and now compare
the order of the file groups (by size in bytes/number of files) with and
without the option.
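
To illustrate the kind of comparison described above, here is a minimal, self-contained sketch. It does not use the Iceberg classes: `Group`, `sizesInPlannedOrder`, and `sizesBytesAsc` are hypothetical stand-ins, and the sketch assumes that a bytes-ascending job order simply means sorting the file groups by total size in bytes.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class JobOrderSketch {

  // Hypothetical stand-in for a rewrite file group; NOT Iceberg's RewriteFileGroup.
  static final class Group {
    final long sizeInBytes;
    final int numFiles;

    Group(long sizeInBytes, int numFiles) {
      this.sizeInBytes = sizeInBytes;
      this.numFiles = numFiles;
    }
  }

  // Sizes in the order the groups were planned (no job order applied).
  static List<Long> sizesInPlannedOrder(List<Group> groups) {
    return groups.stream().map(g -> g.sizeInBytes).collect(Collectors.toList());
  }

  // Sizes after sorting groups by size in bytes, ascending
  // (assumed meaning of a bytes-ascending job order).
  static List<Long> sizesBytesAsc(List<Group> groups) {
    return groups.stream()
        .sorted(Comparator.comparingLong((Group g) -> g.sizeInBytes))
        .map(g -> g.sizeInBytes)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Group> planned =
        List.of(new Group(300L, 3), new Group(100L, 1), new Group(200L, 2));
    System.out.println(sizesInPlannedOrder(planned)); // planned order, unsorted
    System.out.println(sizesBytesAsc(planned));       // ascending by bytes
  }
}
```

A test along these lines would assert that the two lists differ when the planned order is not already sorted, and that the second list is in ascending order.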
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]