amogh-jahagirdar commented on code in PR #16689:
URL: https://github.com/apache/iceberg/pull/16689#discussion_r3392442217
##########
core/src/main/java/org/apache/iceberg/TrackingBuilder.java:
##########
@@ -99,35 +98,54 @@ private TrackingBuilder(Tracking source, long newSnapshotId) {
this.replacedPositions = null;
}
- /** Indicates that the DV has been updated for the new Tracking. */
+ /**
+ * Records that the file's DV was updated by this commit, advancing {@code dvSnapshotId} to the
+ * commit snapshot. An EXISTING entry transitions to MODIFIED; an ADDED entry stays ADDED (a file
+ * added and given a DV in the same commit).
+ */
TrackingBuilder dvUpdated() {
- // DV applies to data files; deleted/replaced positions apply to manifest files
Preconditions.checkState(
deletedPositions == null && replacedPositions == null,
"Cannot mark DV updated on a manifest entry (deleted/replaced positions are set)");
this.dvSnapshotId = newSnapshotId;
+ if (status == EntryStatus.EXISTING) {
+ this.status = EntryStatus.MODIFIED;
+ }
+
return this;
}
+ /**
+ * Records the manifest-leaf positions deleted by this commit, advancing {@code dvSnapshotId} to
+ * the commit snapshot and transitioning an EXISTING entry to MODIFIED. Cannot be called on an
+ * ADDED entry.
+ */
TrackingBuilder deletedPositions(ByteBuffer positions) {
Preconditions.checkState(
- status == EntryStatus.EXISTING, "Cannot set deleted positions on %s entry", status);
- // DV applies to data files; deleted positions apply to manifest files
- Preconditions.checkState(
- dvSnapshotId == null,
- "Cannot set deleted positions on a data file entry (DV snapshot ID is set)");
+ status != EntryStatus.ADDED, "Cannot set deleted positions on ADDED entry");
this.deletedPositions = ByteBuffers.toByteArray(positions);
+ this.dvSnapshotId = newSnapshotId;
+ if (status == EntryStatus.EXISTING) {
+ this.status = EntryStatus.MODIFIED;
+ }
+
return this;
}
+ /**
+ * Records the manifest-leaf positions replaced by this commit, advancing {@code dvSnapshotId} to
Review Comment:
Nit: should this read "Records the leaf manifest positions" rather than "manifest-leaf positions"?
##########
core/src/test/java/org/apache/iceberg/TestTrackingStruct.java:
##########
@@ -433,6 +429,48 @@ void testExistingToTerminalTransitions() {
assertThat(replaced.snapshotId()).isEqualTo(999L);
}
+ @Test
+ void testExistingPreservesSourceSnapshotId() {
+ Tracking source = sourceTracking();
+ Tracking existing = TrackingBuilder.from(source, 999L).build();
+ assertThat(existing.status()).isEqualTo(EntryStatus.EXISTING);
+ assertThat(existing.snapshotId()).isEqualTo(source.snapshotId()).isNotEqualTo(999L);
+ }
+
+ @Test
+ void testCarryForwardFromModifiedSourceChangesToExisting() {
+ // A MODIFIED entry from a prior commit carried forward without mutation; status becomes
+ // EXISTING.
Review Comment:
I'm not sure inline comments like this are really helpful when the test name
itself already says what is being tested.
##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,12 @@ enum EntryStatus {
EXISTING(0),
ADDED(1),
DELETED(2),
- /** Indicates an entry that has been replaced by a column update or DV change. Added in v4. */
- REPLACED(3);
+ /**
+ * The old (replaced) state of an entry that has been modified. Paired with MODIFIED. Added in v4.
Review Comment:
The current state of the comment seems right to me, and it implicitly covers the
case of leaf manifests that @gaborkaszab was talking about. REPLACED is always
paired with MODIFIED, but MODIFIED is not necessarily paired with REPLACED (no
need to state that last part explicitly in the comment).
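
For reference, the status transitions described in the TrackingBuilder hunk above can be sketched in isolation. This is only a minimal illustration: `StatusTransitionSketch`, its nested `Status` enum, and both method names are hypothetical stand-ins, not the actual Iceberg classes, and only the rules stated in the quoted Javadoc are modeled (EXISTING becomes MODIFIED, ADDED stays ADDED on a DV update, deleted positions are rejected on ADDED).

```java
// Hypothetical standalone sketch; not the real TrackingBuilder or EntryStatus.
final class StatusTransitionSketch {
  // Stand-in for the statuses named in the diff (numeric ids omitted on purpose).
  enum Status { EXISTING, ADDED, DELETED, REPLACED, MODIFIED }

  private StatusTransitionSketch() {}

  // Mirrors dvUpdated(): an EXISTING entry becomes MODIFIED, any other status is kept.
  static Status afterDvUpdated(Status status) {
    return status == Status.EXISTING ? Status.MODIFIED : status;
  }

  // Mirrors deletedPositions(): rejected on ADDED, EXISTING becomes MODIFIED.
  static Status afterDeletedPositions(Status status) {
    if (status == Status.ADDED) {
      throw new IllegalStateException("Cannot set deleted positions on ADDED entry");
    }
    return status == Status.EXISTING ? Status.MODIFIED : status;
  }
}
```

In this sketch MODIFIED is produced without any REPLACED value being created alongside it, which matches the one-directional pairing noted above.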