RussellSpitzer commented on code in PR #4902:
URL: https://github.com/apache/iceberg/pull/4902#discussion_r916887269
##########
spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteDataFilesProcedure.java:
##########
@@ -133,6 +145,43 @@ public void testRewriteDataFilesWithSortStrategy() {
assertEquals("Data after compaction should not change", expectedRecords, actualRecords);
}
+ @Test
+ public void testRewriteDataFilesWithZOrder() {
+ createTable();
+ // create 10 files under non-partitioned table
+ insertData(10);
+ List<Object[]> expectedRecords = currentData();
+
+ // set z_order = c1,c2
+ List<Object[]> output = sql(
+ "CALL %s.system.rewrite_data_files(table => '%s', " +
+ "strategy => 'sort', sort_order => 'zorder(c1,c2)')",
+ catalogName, tableIdent);
+
+ assertEquals("Action should rewrite 10 data files and add 1 data files",
+ ImmutableList.of(row(10, 1)),
+ output);
+
+ List<Object[]> actualRecords = currentData();
+ assertEquals("Data after compaction should not change", expectedRecords, actualRecords);
+
+ // Due to Z_order, the data written will be in the below order.
+ // As there is only one small output file, we can validate the query ordering (as it will not change).
+ ImmutableList<Object[]> expectedRows = ImmutableList.of(
Review Comment:
The principle I would go for here is "property testing": instead of
asserting an absolute ("this operation produces exactly this order"), we assert
something like "this operation produces an order that is different from another
order". That way we can change the algorithm and this test (which doesn't
actually check the correctness of the algorithm, only whether
something happened) doesn't have to change.
So in this case we could check that the order of the data is different
from the hierarchically sorted data and also different from the original ordering
of the data (without any sort or zorder).
That said, we can always skip this for now, but in general I try to avoid
tests with absolute answers when the test isn't trying to verify that
specific answer.
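To make the suggestion concrete, here is a minimal sketch of such a check in plain Java. It assumes rows can be represented as lists of integers. `OrderPropertyCheck` and its helpers are hypothetical names, not part of the Iceberg test utilities, and the `rewritten` rows are a hand-picked stand-in for whatever the rewrite produces, not real Z-order output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class OrderPropertyCheck {

  // Compares rows column by column (c1 first, then c2, ...),
  // i.e. the ordering a plain hierarchical sort would use.
  static int compareHierarchically(List<Integer> a, List<Integer> b) {
    int shared = Math.min(a.size(), b.size());
    for (int i = 0; i < shared; i++) {
      int cmp = Integer.compare(a.get(i), b.get(i));
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.size(), b.size());
  }

  // Returns a hierarchically sorted copy; the input list is not modified.
  static List<List<Integer>> hierarchicallySorted(List<List<Integer>> rows) {
    List<List<Integer>> copy = new ArrayList<>(rows);
    copy.sort(OrderPropertyCheck::compareHierarchically);
    return copy;
  }

  // True when both lists contain the same rows but in a different order.
  static boolean differsInOrder(List<List<Integer>> actual, List<List<Integer>> other) {
    boolean sameRows = hierarchicallySorted(actual).equals(hierarchicallySorted(other));
    return sameRows && !actual.equals(other);
  }

  public static void main(String[] args) {
    // Made-up (c1, c2) rows in their original insertion order.
    List<List<Integer>> original = Arrays.asList(
        Arrays.asList(3, 1), Arrays.asList(0, 2), Arrays.asList(2, 0), Arrays.asList(1, 3));

    // Stand-in for the order read back after the rewrite (assumption, not real output).
    List<List<Integer>> rewritten = Arrays.asList(
        Arrays.asList(0, 2), Arrays.asList(2, 0), Arrays.asList(1, 3), Arrays.asList(3, 1));

    // Property 1: the rewrite changed the order relative to the original data.
    System.out.println(differsInOrder(rewritten, original));

    // Property 2: the result is not simply the hierarchical sort on (c1, c2).
    System.out.println(differsInOrder(rewritten, hierarchicallySorted(original)));
  }
}
```

Neither property pins down the exact output order, so a check like this would keep passing if the Z-order implementation changed, while still failing if the rewrite left the data untouched or fell back to a plain sort.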
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]