rdblue commented on code in PR #16408:
URL: https://github.com/apache/iceberg/pull/16408#discussion_r3284957410


##########
core/src/main/java/org/apache/iceberg/TrackingStruct.java:
##########
@@ -268,67 +286,67 @@ public String toString() {
   }
 
   static class Builder {
-    private EntryStatus status = null;
-    private Long snapshotId = null;
-    private Long dataSequenceNumber = null;
-    private Long fileSequenceNumber = null;
-    private Long dvSnapshotId = null;
-    private Long firstRowId = null;
-    private byte[] deletedPositions = null;
-    private byte[] replacedPositions = null;
-
-    Builder status(EntryStatus entryStatus) {
-      this.status = entryStatus;
-      return this;
-    }
-
-    Builder snapshotId(Long id) {
-      this.snapshotId = id;
-      return this;
-    }
-
-    Builder dataSequenceNumber(Long sequenceNumber) {
-      this.dataSequenceNumber = sequenceNumber;
-      return this;
+    private final EntryStatus status;
+    private final Long snapshotId;
+    private final Long dataSequenceNumber;
+    private final Long fileSequenceNumber;
+    private final Long firstRowId;
+    private Long dvSnapshotId;
+    private byte[] deletedPositions;
+    private byte[] replacedPositions;
+
+    private Builder(EntryStatus status, long snapshotId) {
+      this.status = status;
+      this.snapshotId = snapshotId;
+      this.dataSequenceNumber = null;
+      this.fileSequenceNumber = null;
+      this.firstRowId = null;
     }
 
-    Builder fileSequenceNumber(Long sequenceNumber) {
-      this.fileSequenceNumber = sequenceNumber;
-      return this;
+    private Builder(EntryStatus status, Tracking source, Long snapshotId) {
+      Preconditions.checkArgument(
+          source.dataSequenceNumber() != null,
+          "Source tracking has no data sequence number; cannot mark as %s",
+          status);
+      Preconditions.checkArgument(
+          source.fileSequenceNumber() != null,
+          "Source tracking has no file sequence number; cannot mark as %s",
+          status);
+      this.status = status;
+      this.snapshotId = snapshotId;
+      this.dataSequenceNumber = source.dataSequenceNumber();
+      this.fileSequenceNumber = source.fileSequenceNumber();
+      this.firstRowId = source.firstRowId();
+      this.dvSnapshotId = source.dvSnapshotId();
+      ByteBuffer sourceDeleted = source.deletedPositions();
+      this.deletedPositions = sourceDeleted != null ? 
ByteBuffers.toByteArray(sourceDeleted) : null;
+      ByteBuffer sourceReplaced = source.replacedPositions();
+      this.replacedPositions =
+          sourceReplaced != null ? ByteBuffers.toByteArray(sourceReplaced) : 
null;
     }
 
-    Builder dvSnapshotId(Long id) {
+    Builder dvSnapshotId(long id) {
+      Preconditions.checkState(
+          status != EntryStatus.DELETED, "Cannot set dv snapshot ID on a 
DELETED entry");
       this.dvSnapshotId = id;
       return this;
     }
 
-    Builder firstRowId(Long rowId) {
-      this.firstRowId = rowId;
-      return this;
-    }
-
     Builder deletedPositions(ByteBuffer positions) {
+      Preconditions.checkState(
+          status != EntryStatus.DELETED, "Cannot set deleted positions on a 
DELETED entry");

Review Comment:
   Deleted or replaced positions are only valid for manifest files that have 
been changed. That means that `ADDED` is also not allowed here. When the status 
is `DELETED` or `REPLACED`, the tracking does not need to include these bitmaps 
(even if the copied entry did) so I would also not allow them for those 
statuses.
   
   If the status is `EXISTING` then it must not have changed, so it also 
doesn't allow these bitmaps. This one is slightly different though, because we 
could choose to automatically detect and set `EXISTING` or `MODIFIED` based on 
changes to the status. If a DV is updated by calling `dvSnapshotId` then we 
could change the status to `MODIFIED`, or if these MDV methods are called we 
could also update the status to `MODIFIED`.
   
   Lastly, this is used only for manifests and `dvSnapshotId` is used only for 
data files, so we should ensure that they are not set on the same builder.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to