stevenzwu commented on code in PR #16689:
URL: https://github.com/apache/iceberg/pull/16689#discussion_r3364254819


##########
core/src/main/java/org/apache/iceberg/Tracking.java:
##########
@@ -28,13 +28,13 @@ interface Tracking {
           0,
           "status",
           Types.IntegerType.get(),
-          "Entry status: 0=existing, 1=added, 2=deleted, 3=replaced");
+          "Entry status: 0=existing, 1=added, 2=deleted, 3=replaced, 
4=modified");
   Types.NestedField SNAPSHOT_ID =
       Types.NestedField.optional(
           1,
           "snapshot_id",
           Types.LongType.get(),
-          "Snapshot ID where the file was added or deleted");
+          "Snapshot ID where the file was added, deleted, replaced, or 
modified");

Review Comment:
   `TrackingBuilder.terminal()` writes `newSnapshotId` for both DELETED and 
REPLACED today, so the doc matches the current code.
   
   To me, `deleted` and `replaced` entries should behave the same. replaced is 
just a special flavor of deleted. Since v1-v3 already modify the snapshot ID 
for a deleted entry, it seems reasonable to maintain the same behavior.
   
   Also for change detection, it is probably better to update the snapshot id 
in tracking for deleted or replaced entries. If the tracking snapshot_id 
doesn't match the current snapshot id, the entries should be ignored for change 
detection. This is important because the current spec is silent on lifetime of 
deleted entries.



##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,13 @@ enum EntryStatus {
   EXISTING(0),
   ADDED(1),
   DELETED(2),
-  /** Indicates an entry that has been replaced by a column update or DV 
change. Added in v4. */
-  REPLACED(3);
+  /**
+   * Non-live entry recording that a prior file version was superseded by 
another live entry. Added
+   * in v4.
+   */
+  REPLACED(3),
+  /** Live entry recording that the file was modified in this snapshot. Added 
in v4. */

Review Comment:
   Mirror the framing rdblue suggested for REPLACED above — MODIFIED is its 
live counterpart.
   
   ```suggestion
     /** The new (live) state of an entry that has been modified. Paired with 
REPLACED. Added in v4. */
   ```



##########
core/src/main/java/org/apache/iceberg/EntryStatus.java:
##########
@@ -23,8 +23,13 @@ enum EntryStatus {
   EXISTING(0),
   ADDED(1),
   DELETED(2),
-  /** Indicates an entry that has been replaced by a column update or DV 
change. Added in v4. */
-  REPLACED(3);
+  /**
+   * Non-live entry recording that a prior file version was superseded by 
another live entry. Added
+   * in v4.

Review Comment:
   Is `old` better than `starting` in this conext?
   
   > The old (replaced) state of en entry that is modified. Paired with 
MODIFIED. Added in v4.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to