gaborkaszab commented on code in PR #16408:
URL: https://github.com/apache/iceberg/pull/16408#discussion_r3324856726


##########
core/src/main/java/org/apache/iceberg/TrackingStruct.java:
##########
@@ -59,6 +59,11 @@ class TrackingStruct extends SupportsIndexProjection 
implements Tracking, Serial
     super(BASE_TYPE, type);
   }
 
+  /** Constructor for Java serialization. */
+  TrackingStruct() {

Review Comment:
   nit: Is this relevant for this PR? It seems independent and better suited to a 
follow-up "Java serialization for V4 metadata structs" PR.



##########
core/src/main/java/org/apache/iceberg/TrackingStruct.java:
##########
@@ -249,95 +254,43 @@ protected <T> void internalSet(int pos, T value) {
     }
   }
 
-  static Builder builder() {
-    return new Builder();
+  /** Creates a builder for a newly added file in the given snapshot. */
+  static TrackingBuilder added(long snapshotId) {
+    return new TrackingBuilder(snapshotId);
+  }
+
+  /**
+   * Creates a builder for a tracking row derived from {@code source} at the 
current snapshot.
+   *
+   * <p>Without MODIFIED status, this produces an EXISTING row. Once MODIFIED 
lands, the status will

Review Comment:
   nit: most of this comment describes what will happen once we introduce 
MODIFIED. I don't think that is useful here; a TODO is probably enough.



##########
core/src/main/java/org/apache/iceberg/TrackingStruct.java:
##########
@@ -79,7 +84,7 @@ private TrackingStruct(TrackingStruct toCopy) {
     this.manifestPos = toCopy.manifestPos;
   }
 
-  private TrackingStruct(
+  TrackingStruct(

Review Comment:
   nit: the constructor's visibility was widened because the builder moved to 
another file, right? Doesn't this give too much flexibility? For instance, 
someone could add a class in the same package and create a TrackingStruct with 
uncontrolled inputs.
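
   To illustrate the concern, here is a minimal sketch, not code from the PR. 
`ConstructorVisibilitySketch`, `Struct`, and `Builder` are hypothetical 
stand-ins, and the non-negative `manifestPos` check is an assumed invariant 
chosen only for the example. It shows one possible mitigation: if the 
constructor has to be package-private, the checks can live in the constructor 
so that a same-package caller cannot skip them by bypassing the builder.

```java
/** Hypothetical stand-ins for illustration; these are not the real Iceberg classes. */
final class ConstructorVisibilitySketch {

  static final class Struct {
    private final long snapshotId;
    private final Long manifestPos;

    // Package-private: any class in the same package can call this directly,
    // so the (assumed) invariant is checked here rather than only in the builder.
    Struct(long snapshotId, Long manifestPos) {
      if (manifestPos != null && manifestPos < 0) {
        throw new IllegalArgumentException("Invalid manifest position: " + manifestPos);
      }
      this.snapshotId = snapshotId;
      this.manifestPos = manifestPos;
    }

    long snapshotId() {
      return snapshotId;
    }

    Long manifestPos() {
      return manifestPos;
    }
  }

  /** Stand-in for a builder that lives in a separate file in the same package. */
  static final class Builder {
    private final long snapshotId;
    private Long manifestPos;

    Builder(long snapshotId) {
      this.snapshotId = snapshotId;
    }

    Builder manifestPos(long pos) {
      this.manifestPos = pos;
      return this;
    }

    Struct build() {
      // The builder adds no checks of its own; it relies on the constructor.
      return new Struct(snapshotId, manifestPos);
    }
  }
}
```

   With this shape, both `new Struct(1L, -5L)` and the builder path fail the 
same way, so widening the constructor does not open an unchecked path.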



##########
core/src/main/java/org/apache/iceberg/TrackingStruct.java:
##########
@@ -249,95 +254,43 @@ protected <T> void internalSet(int pos, T value) {
     }
   }
 
-  static Builder builder() {
-    return new Builder();
+  /** Creates a builder for a newly added file in the given snapshot. */
+  static TrackingBuilder added(long snapshotId) {
+    return new TrackingBuilder(snapshotId);
+  }
+
+  /**
+   * Creates a builder for a tracking row derived from {@code source} at the 
current snapshot.
+   *
+   * <p>Without MODIFIED status, this produces an EXISTING row. Once MODIFIED 
lands, the status will
+   * be auto-derived from the source, the snapshot, and which mutation methods 
are called.
+   */
+  // TODO: when MODIFIED is added, derive status from source + 
currentSnapshotId + mutations.
+  static TrackingBuilder builder(Tracking source, long currentSnapshotId) {

Review Comment:
   For creating an EXISTING entry, do we have to change anything on the source? 
I think, for simplicity, we could introduce an API such as `Tracking 
Tracking.asExisting()` that does not return a `TrackingBuilder` in the 
implementation, separating the creation of EXISTING and MODIFIED entries.
   Then, unless I'm missing something, we could dedicate `TrackingBuilder` to 
creating MODIFIED entries (later requiring either the DVs or the column files to 
be changed). Then maybe this function could be called `modifiedBuilder`.
   
   WDYT?
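
   A rough sketch of the split suggested above, under loud assumptions: 
`TrackingSketch`, `SimpleTracking`, `ModifiedBuilder`, and the string-valued 
`dvLocation`/`columnFile` fields are hypothetical stand-ins, not the real 
Iceberg types. MODIFIED does not exist yet, and which snapshot ID each status 
carries is my guess rather than anything settled in this PR.

```java
/** Hypothetical, simplified stand-ins; these are not the real Iceberg types. */
final class TrackingSketch {
  enum Status { ADDED, EXISTING, MODIFIED }

  interface Tracking {
    Status status();
    long snapshotId();
    String dvLocation(); // stand-in for a deletion vector reference; may be null
    String columnFile(); // stand-in for a column file reference; may be null

    /** Carries the row forward unchanged; no builder is involved. */
    Tracking asExisting();
  }

  static final class SimpleTracking implements Tracking {
    private final Status status;
    private final long snapshotId;
    private final String dvLocation;
    private final String columnFile;

    SimpleTracking(Status status, long snapshotId, String dvLocation, String columnFile) {
      this.status = status;
      this.snapshotId = snapshotId;
      this.dvLocation = dvLocation;
      this.columnFile = columnFile;
    }

    @Override public Status status() { return status; }
    @Override public long snapshotId() { return snapshotId; }
    @Override public String dvLocation() { return dvLocation; }
    @Override public String columnFile() { return columnFile; }

    @Override
    public Tracking asExisting() {
      // Assumption: an EXISTING row keeps the source's snapshot ID and payload.
      return new SimpleTracking(Status.EXISTING, snapshotId, dvLocation, columnFile);
    }
  }

  /** Builder dedicated to MODIFIED rows; at least one mutation is required. */
  static final class ModifiedBuilder {
    private final long currentSnapshotId;
    private String dvLocation;
    private String columnFile;
    private boolean changed = false;

    private ModifiedBuilder(Tracking source, long currentSnapshotId) {
      this.currentSnapshotId = currentSnapshotId;
      this.dvLocation = source.dvLocation();
      this.columnFile = source.columnFile();
    }

    ModifiedBuilder dvLocation(String newDvLocation) {
      this.dvLocation = newDvLocation;
      this.changed = true;
      return this;
    }

    ModifiedBuilder columnFile(String newColumnFile) {
      this.columnFile = newColumnFile;
      this.changed = true;
      return this;
    }

    Tracking build() {
      if (!changed) {
        throw new IllegalStateException("MODIFIED requires a DV or column file change");
      }
      // Assumption: a MODIFIED row is stamped with the current snapshot ID.
      return new SimpleTracking(Status.MODIFIED, currentSnapshotId, dvLocation, columnFile);
    }
  }

  static ModifiedBuilder modifiedBuilder(Tracking source, long currentSnapshotId) {
    return new ModifiedBuilder(source, currentSnapshotId);
  }
}
```

   Usage would then read as `source.asExisting()` for carry-over rows and 
`TrackingSketch.modifiedBuilder(source, snapshotId).dvLocation(...).build()` 
for changed rows, so the status never has to be derived from which setters were 
called.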



