RussellSpitzer commented on code in PR #7585:
URL: https://github.com/apache/iceberg/pull/7585#discussion_r1204808311


##########
spark/v3.4/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -1384,6 +1384,104 @@ public void testRewriteJobOrderFilesDesc() {
     Assert.assertNotEquals("Number of files order should not be ascending", actual, expected);
   }
 
+  @Test
+  public void testZOrderSortPartitionEvolution() {
+    int originalFiles = 20;
+    Table table = createTable(originalFiles);
+    shouldHaveLastCommitUnsorted(table, "c2");
+    shouldHaveFiles(table, originalFiles);
+
+    table
+        .updateSpec()
+        .addField(Expressions.bucket("c1", 2))
+        .addField(Expressions.bucket("c2", 2))
+        .commit();
+
+    long dataSizeBefore = testDataSize(table);
+
+    RewriteDataFiles.Result result =
+        basicRewrite(table)
+            .zOrder("c2", "c3")
+            .option(
+                SortStrategy.MAX_FILE_SIZE_BYTES,
+                Integer.toString((averageFileSize(table) / 2) + 2))
+            // Divide files in 2
+            .option(
+                RewriteDataFiles.TARGET_FILE_SIZE_BYTES,
+                Integer.toString(averageFileSize(table) / 2))
+            .option(SortStrategy.MIN_INPUT_FILES, "1")
+            .execute();
+
+    Assert.assertEquals("Should have 1 fileGroups", 1, result.rewriteResults().size());
+    assertThat(result.rewrittenBytesCount()).isEqualTo(dataSizeBefore);
+  }
+
+  @Test
+  public void testRewriteWithDifferentOutputSpecIds() {
+    Table table = createTable(10);
+    shouldHaveFiles(table, 10);
+
+    // simulate multiple partition specs with different commit
+    table.updateSpec().addField(Expressions.truncate("c2", 2)).commit();
+    table.updateSpec().addField(Expressions.bucket("c3", 2)).commit();
+
+    performRewriteAndAssertForAllTableSpecs(table, "bin-pack");
+    performRewriteAndAssertForAllTableSpecs(table, "sort");
+    performRewriteAndAssertForAllTableSpecs(table, "zOrder");
+  }
+
+  private void performRewriteAndAssertForAllTableSpecs(Table table, String strategy) {

Review Comment:
   I still think we are trying to hide too much in helper functions here. The 
function is difficult to read and assumes a lot about the current state of the 
system. Can we try to limit the implied state? For example, asserting that 
table.specs() has size 3 is a bit odd.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

