rdblue commented on code in PR #6090:
URL: https://github.com/apache/iceberg/pull/6090#discussion_r1053500032


##########
core/src/test/java/org/apache/iceberg/TestRemoveSnapshots.java:
##########
@@ -1234,6 +1245,74 @@ public void testMultipleRefsAndCleanExpiredFilesFailsForIncrementalCleanup() {
                 .commit());
   }
 
+  @Test
+  public void testExpireWithStatisticsFiles() throws URISyntaxException, IOException {
+    table.newAppend().appendFile(FILE_A).commit();
+    File statsFileLocation1 = statsFileLocation(table);
+    StatisticsFile statisticsFile1 = writeStatsFileForCurrentSnapshot(table, statsFileLocation1);
+    Assert.assertEquals(
+        "Must match the latest snapshot",
+        table.currentSnapshot().snapshotId(),
+        statisticsFile1.snapshotId());
+
+    table.newAppend().appendFile(FILE_B).commit();
+    File statsFileLocation2 = statsFileLocation(table);
+    StatisticsFile statisticsFile2 = writeStatsFileForCurrentSnapshot(table, statsFileLocation2);
+    Assert.assertEquals(
+        "Must match the latest snapshot",
+        table.currentSnapshot().snapshotId(),
+        statisticsFile2.snapshotId());
+
+    Assert.assertEquals("Should have 2 statistics file", 2, table.statisticsFiles().size());
+
+    table.updateProperties().set(TableProperties.MAX_SNAPSHOT_AGE_MS, "1").commit();
+
+    removeSnapshots(table).commit();
+
+    Assert.assertEquals("Should keep 1 snapshot", 1, Iterables.size(table.snapshots()));
+    Assertions.assertThat(table.statisticsFiles())
+        .hasSize(1)
+        .extracting(StatisticsFile::snapshotId)
+        .as("Should contain only the statistics file of snapshot2")
+        .isEqualTo(Lists.newArrayList(statisticsFile2.snapshotId()));
+    Assertions.assertThat(statsFileLocation1.exists()).isFalse();
+    Assertions.assertThat(statsFileLocation2.exists()).isTrue();
+  }
+
+  @Test
+  public void testExpireWithStatisticsFilesWithReuse() throws URISyntaxException, IOException {
+    table.newAppend().appendFile(FILE_A).commit();
+    File statsFileLocation1 = statsFileLocation(table);
+    StatisticsFile statisticsFile1 = writeStatsFileForCurrentSnapshot(table, statsFileLocation1);
+    Assert.assertEquals(
+        "Must match the latest snapshot",
+        table.currentSnapshot().snapshotId(),
+        statisticsFile1.snapshotId());
+
+    table.newAppend().appendFile(FILE_B).commit();
+    // reuse the existing stats file with the current snapshot
+    StatisticsFile statisticsFile2 = reuseStatsForCurrentSnapshot(table, statisticsFile1);

Review Comment:
   I can understand why you'd want to test that reused files are not removed.
   However, this metadata is invalid, so I think this test should have a warning
   comment at the top explaining why it tests this case and noting that it is
   not expected in real tables.
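
   To make the case under test concrete, here is a minimal sketch of the rule
   the reuse test checks: a statistics file path may only be deleted when no
   retained snapshot still references it. This is a simplified, hypothetical
   model and not Iceberg's actual expiration code; `StatsFileCleanupSketch`,
   `StatsEntry`, and `pathsSafeToDelete` are names invented for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of statistics-file cleanup.
// None of these names come from the Iceberg code base.
class StatsFileCleanupSketch {

  // Hypothetical stand-in for a statistics file entry in table metadata:
  // the snapshot it is registered for and the physical file location.
  static final class StatsEntry {
    final long snapshotId;
    final String path;

    StatsEntry(long snapshotId, String path) {
      this.snapshotId = snapshotId;
      this.path = path;
    }
  }

  // Returns the file paths that belong only to expired snapshots.
  // A path shared with a retained snapshot (the "reuse" case, which is
  // not expected in real tables) is never returned.
  static Set<String> pathsSafeToDelete(List<StatsEntry> entries, Set<Long> retainedSnapshotIds) {
    Set<String> stillReferenced = new HashSet<>();
    Set<String> expired = new HashSet<>();
    for (StatsEntry entry : entries) {
      if (retainedSnapshotIds.contains(entry.snapshotId)) {
        stillReferenced.add(entry.path);
      } else {
        expired.add(entry.path);
      }
    }
    // Drop any expired path that a retained snapshot also points at.
    expired.removeAll(stillReferenced);
    return expired;
  }
}
```

   With two entries that share one path and only the second snapshot retained,
   the result is empty; with two distinct paths, only the first snapshot's
   path is returned.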



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

