amogh-jahagirdar commented on code in PR #17039:
URL: https://github.com/apache/iceberg/pull/17039#discussion_r3521029107


##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -2069,6 +2069,36 @@ public void 
testUnpartitionedRewriteDataFilesPreservesLineage() throws NoSuchTab
     assertEquals("Rows must match", expectedRecordsWithLineage, 
actualRecordsWithLineage);
   }
 
+  @TestTemplate

Review Comment:
   Ok here's a more minimal test I came up with so we don't even have to append 
any records. We can just do a compact + upgrade + rewrite manifests to move the 
new file into a single v3 manifest. since there's a single data file/single 
manifest we can then reliably assert the row IDs regardless of any change in 
the implementation.
   
   ```
   +  @TestTemplate
   +  public void 
testLastUpdatedSequenceInheritedFromDataSequenceAfterUpgrade() {
   +    assumeThat(formatVersion).isEqualTo(2);
   +
   +    Table table = createTable();
   +
   +    // Two data files in a single v2 commit; both share data sequence 
number 1.
   +    writeRecords(2 /* files */, 4 /* records */);
   +    table.refresh();
   +    shouldHaveFiles(table, 2);
   +    long committedDataSequence = table.currentSnapshot().sequenceNumber();
   +
   +    basicRewrite(table).execute();
   +    table.refresh();
   +    shouldHaveFiles(table, 1);
   +
   +    DataFile compacted = Iterables.getOnlyElement(currentDataFiles(table));
   +    long dataSequence = compacted.dataSequenceNumber();
   +    long fileSequence = compacted.fileSequenceNumber();
   +    assertThat(dataSequence)
   +        .as("Compaction must preserve the original data sequence number")
   +        .isEqualTo(committedDataSequence);
   +    assertThat(fileSequence)
   +        .as("Compaction must bump the file sequence above the preserved 
data sequence")
   
   +    shouldHaveFiles(table, 1);
   +
   +    DataFile compacted = Iterables.getOnlyElement(currentDataFiles(table));
   +    long dataSequence = compacted.dataSequenceNumber();
   +    long fileSequence = compacted.fileSequenceNumber();
   +    assertThat(dataSequence)
   +        .as("Compaction must preserve the original data sequence number")
   +        .isEqualTo(committedDataSequence);
   +    assertThat(fileSequence)
   +        .as("Compaction must bump the file sequence above the preserved 
data sequence")
   +        .isGreaterThan(dataSequence);
   +
   +    // Upgrade to v3, then move the carried-over file into a new v3 
manifest 
   +    table.updateProperties().set(TableProperties.FORMAT_VERSION, 
"3").commit();
   +    table.rewriteManifests().rewriteIf(manifest -> true).commit();
   +    table.refresh();
   +
   +    List<Object[]> expected =
   +        Lists.newArrayList(
   +            row(0L, dataSequence, ANY, ANY, ANY),
   +            row(1L, dataSequence, ANY, ANY, ANY),
   +            row(2L, dataSequence, ANY, ANY, ANY),
   +            row(3L, dataSequence, ANY, ANY, ANY));
   +
   +    assertEquals(
   +        "Row IDs must be 0..3 and last-updated must inherit the data 
sequence",
   +        expected,
   +        currentDataWithLineage());
   +  }
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to