gaborkaszab commented on code in PR #16689:
URL: https://github.com/apache/iceberg/pull/16689#discussion_r3362445989
##########
core/src/test/java/org/apache/iceberg/TestTrackingStruct.java:
##########
@@ -433,6 +453,81 @@ void testExistingToTerminalTransitions() {
assertThat(replaced.snapshotId()).isEqualTo(999L);
}
+ private static Stream<Arguments> deriveStatusCases() {
+ long sameSnap = 42L;
Review Comment:
nit: I think it's worth spelling out the names: `sameSnapshot`, `laterSnapshot`
##########
core/src/test/java/org/apache/iceberg/TestTrackingStruct.java:
##########
@@ -493,7 +588,7 @@ void testAddedWithDvSnapshotIdJavaSerializationRoundTrip()
}
@Test
- void testExistingWithManifestDVPositionsJavaSerializationRoundTrip()
+ void testModifiedWithManifestDVPositionsJavaSerializationRoundTrip()
Review Comment:
If I'm not mistaken, serde doesn't go through the builder, so in practice
there is no point in having separate tests based on status. I see one for added
and one for existing -> modified here. Maybe have a single one with all the
possible fields called `testJavaSerializationRoundTrip`?
Same for the Kryo versions.
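To illustrate the point about serde bypassing the builder, here is a minimal, self-contained sketch of what a single round-trip check could look like. `TrackingLike` and its field names are hypothetical stand-ins, not the real `Tracking` class, and the `roundTrip` helper is written out only to keep the example standalone; the actual test would populate every field of the real struct and may reuse existing test helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationRoundTripSketch {

  // Hypothetical stand-in for the tracking struct; the fields are assumptions.
  static class TrackingLike implements Serializable {
    private static final long serialVersionUID = 1L;
    final int statusId;
    final Long snapshotId;
    final Long dvSnapshotId;

    TrackingLike(int statusId, Long snapshotId, Long dvSnapshotId) {
      this.statusId = statusId;
      this.snapshotId = snapshotId;
      this.dvSnapshotId = dvSnapshotId;
    }
  }

  // Writes the value with plain Java serialization and reads it back.
  // Deserialization restores fields directly and does not call the
  // TrackingLike constructor, so any builder-side logic is never exercised.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (T) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // One instance with every field set covers all statuses at once.
    TrackingLike original = new TrackingLike(1, 42L, 43L);
    TrackingLike copy = roundTrip(original);
    if (copy.statusId != original.statusId
        || !copy.snapshotId.equals(original.snapshotId)
        || !copy.dvSnapshotId.equals(original.dvSnapshotId)) {
      throw new AssertionError("round trip changed a field");
    }
  }
}
```

Since the status is just another field from the serializer's point of view, one populated instance should be enough for each of the Java and Kryo variants.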
##########
core/src/main/java/org/apache/iceberg/TrackingBuilder.java:
##########
@@ -143,6 +149,29 @@ Tracking build() {
replacedPositions);
}
+ /** Derives the output status from the source, the snapshot, and any mutations. */
+ private EntryStatus deriveStatus() {
+ if (source == null) {
+ return EntryStatus.ADDED;
+ }
+
+ boolean sameSnapshot = source.snapshotId() != null && source.snapshotId() == newSnapshotId;
Review Comment:
Reading this, I have the impression that this PR contains two different
pieces of functionality:
1) the "sameSnapshot" case for ADDED status
2) Introduction of MODIFIED and all the transitions to/from it
No strong opinion, but I usually prefer keeping a single purpose for my PRs.
LMK WDYT
##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,15 @@ enum EntryStatus {
EXISTING(0),
ADDED(1),
DELETED(2),
- /** Indicates an entry that has been replaced by a column update or DV change. Added in v4. */
- REPLACED(3);
+ /**
+ * Non-live entry recording that a prior file version was superseded by a column update or DV
Review Comment:
nit: not sure we should state that it's a column update or DV that causes
this. Maybe simply "Non-live entry recording that a prior file version was
superseded by another live entry"? This adds flexibility in the future without
coming back to change the comment.
Same below for MODIFIED
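For reference, a rough sketch of how the suggested wording might read in place. `EntryStatusSketch` is a hypothetical standalone copy written only for illustration; the `id` field and accessor are assumptions, and MODIFIED is left out because its final wording and id are not part of this suggestion.

```java
enum EntryStatusSketch {
  EXISTING(0),
  ADDED(1),
  DELETED(2),
  /**
   * Non-live entry recording that a prior file version was superseded by
   * another live entry.
   */
  REPLACED(3);

  // Numeric id as shown in the diff; the accessor below is an assumption.
  private final int id;

  EntryStatusSketch(int id) {
    this.id = id;
  }

  int id() {
    return id;
  }
}
```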