amogh-jahagirdar commented on code in PR #17039:
URL: https://github.com/apache/iceberg/pull/17039#discussion_r3520874422


##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -2069,6 +2069,36 @@ public void 
testUnpartitionedRewriteDataFilesPreservesLineage() throws NoSuchTab
     assertEquals("Rows must match", expectedRecordsWithLineage, 
actualRecordsWithLineage);
   }
 
+  @TestTemplate

Review Comment:
   Feel free to keep your test name as is, just what i had locally



##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/actions/TestRewriteDataFilesAction.java:
##########
@@ -2069,6 +2069,36 @@ public void 
testUnpartitionedRewriteDataFilesPreservesLineage() throws NoSuchTab
     assertEquals("Rows must match", expectedRecordsWithLineage, 
actualRecordsWithLineage);
   }
 
+  @TestTemplate

Review Comment:
   I think this looks right but I'd cleanup the empty append and do a more 
realistic write, I'd also just use an explicit assertion on the records (just a 
bit more readable/less opaque)
   
   
   ```
   +  @TestTemplate
   +  public void 
testLastUpdatedSequenceInheritedFromDataSequenceAfterUpgrade() {
   +    // Reproduces the inheritance bug for a v2-origin file carried into v3.
   +    assumeThat(formatVersion).isEqualTo(2);
   +
   +    Table table = createTable();
   +    writeRecords(2 /* files */, 4 /* records */);
   +    table.refresh();
   +    shouldHaveFiles(table, 2);
   +    long originalSequenceNumber = table.currentSnapshot().sequenceNumber();
   +    // Perform a compaction before upgrade which preserves the original 
sequence number
   +    basicRewrite(table).execute();
   +    table.refresh();
   +    shouldHaveFiles(table, 1);
   +
   +    DataFile compacted = Iterables.getOnlyElement(currentDataFiles(table));
   +    
assertThat(compacted.dataSequenceNumber()).isEqualTo(originalSequenceNumber);
   +    assertThat(compacted.fileSequenceNumber())
   +        .as("Compaction must bump the file sequence number above the 
preserved data sequence")
   +        .isGreaterThan(originalSequenceNumber);
   +
   +    // Upgrade to v3 so the row lineage metadata columns are exposed on 
read.
   +    table.updateProperties().set(TableProperties.FORMAT_VERSION, 
"3").commit();
   +    table.refresh();
   +
   +    writeRecords(1 /* files */, 4 /* records */);
   +    table.refresh();
   +    long appendSequenceNumber = table.currentSnapshot().sequenceNumber();
   +
   +    // currentDataWithLineage() projects (_row_id, 
_last_updated_sequence_number, *) ordered by
   +    // _row_id; the trailing data columns are irrelevant here, so match 
them with ANY.
   +    List<Object[]> actualLineage = currentDataWithLineage();
   +
   +    List<Object[]> expectedLineage =
   +        Lists.newArrayList(
   +            // The post-upgrade append writes a new file, which is assigned 
the first row IDs (0-3)
   +            // and carries the append commit's sequence number as its 
last-updated sequence.
   +            row(0L, appendSequenceNumber, ANY, ANY, ANY),
   +            row(1L, appendSequenceNumber, ANY, ANY, ANY),
   +            row(2L, appendSequenceNumber, ANY, ANY, ANY),
   +            row(3L, appendSequenceNumber, ANY, ANY, ANY),
   +            row(4L, originalSequenceNumber, ANY, ANY, ANY),
   +            row(5L, originalSequenceNumber, ANY, ANY, ANY),
   +            row(6L, originalSequenceNumber, ANY, ANY, ANY),
   +            row(7L, originalSequenceNumber, ANY, ANY, ANY));
   +
   +    assertEquals(
   +        "Carried-over rows must inherit the data sequence",
   +        expectedLineage,
   +        actualLineage);
   +  }
   +
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to