[GitHub] [iceberg] openinx commented on a change in pull request #2275: Core: add schema id to snapshot

GitBox Tue, 06 Apr 2021 21:10:22 -0700


openinx commented on a change in pull request #2275:
URL: https://github.com/apache/iceberg/pull/2275#discussion_r608315319




##########
File path: api/src/main/java/org/apache/iceberg/Snapshot.java
##########
@@ -126,4 +126,13 @@
    * @return the location of the manifest list for this Snapshot
    */
   String manifestListLocation();
+
+  /**
+   * Return the id of the schema used when this snapshot was created, or null if this information is not available.
+   *
+   * @return schema id associated with this snapshot
+   */
+  default Integer schemaId() {
+    return null;

Review comment:
       In what case will this information be `null`? And if it is null, how can people read the correct schema for the snapshot?

##########
File path: api/src/main/java/org/apache/iceberg/Snapshot.java
##########
@@ -126,4 +126,13 @@
    * @return the location of the manifest list for this Snapshot
    */
   String manifestListLocation();
+
+  /**
+   * Return the id of the schema used when this snapshot was created, or null if this information is not available.
+   *
+   * @return schema id associated with this snapshot
+   */
+  default Integer schemaId() {
+    return null;

Review comment:
       Okay, I think you mean that when people read old metadata, the schema id of its snapshots will be `null`.
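
       A minimal sketch of how a reader might handle a `null` schema id. All names here (`SnapshotSchemaLookup`, `SnapshotInfo`, `schemaFor`) are hypothetical stand-ins rather than Iceberg classes, and falling back to the table's current schema is an assumption that this thread does not confirm:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the Iceberg classes.
class SnapshotSchemaLookup {

  // A snapshot that may or may not carry a schema id.
  static final class SnapshotInfo {
    private final long snapshotId;
    private final Integer schemaId; // null when the metadata predates schema ids

    SnapshotInfo(long snapshotId, Integer schemaId) {
      this.snapshotId = snapshotId;
      this.schemaId = schemaId;
    }

    long snapshotId() {
      return snapshotId;
    }

    Integer schemaId() {
      return schemaId;
    }
  }

  // Schemas are plain strings here to keep the sketch self-contained.
  static String schemaFor(SnapshotInfo snapshot,
                          Map<Integer, String> schemasById,
                          String currentSchema) {
    Integer id = snapshot.schemaId();
    if (id == null) {
      // Assumed fallback policy: old metadata has no schema id, so use the current schema.
      return currentSchema;
    }

    String schema = schemasById.get(id);
    if (schema == null) {
      throw new IllegalStateException(
          String.format("Snapshot %d refers to unknown schema id %d", snapshot.snapshotId(), id));
    }
    return schema;
  }

  public static void main(String[] args) {
    Map<Integer, String> schemasById = new HashMap<>();
    schemasById.put(0, "schema-v0");
    schemasById.put(1, "schema-v1");

    // Snapshot written with a schema id resolves through the map.
    System.out.println(schemaFor(new SnapshotInfo(10L, 0), schemasById, "schema-v1"));
    // Snapshot from old metadata (null id) takes the assumed fallback.
    System.out.println(schemaFor(new SnapshotInfo(11L, null), schemasById, "schema-v1"));
  }
}
```

       Returning a boxed `Integer` keeps "unknown" distinct from any valid id, so the caller has to choose a policy explicitly instead of silently reading schema 0.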

##########
File path: api/src/test/java/org/apache/iceberg/TestHelpers.java
##########
@@ -120,6 +120,22 @@ public static void assertSerializedAndLoadedMetadata(Table expected, Table actua
     Assert.assertEquals("History must match", expected.history(), actual.history());
   }
 
+  public static void assertSameSchemaMap(Map<Integer, Schema> map1, Map<Integer, Schema> map2) {
+    if (map1.size() != map2.size()) {
+      Assert.fail("Should have same number of schemas in both maps");
+    }
+
+    map1.forEach((schemaId, schema1) -> {
+      Schema schema2 = map2.get(schemaId);
+      Assert.assertNotNull(String.format("Schema ID %s should exist in both map", schemaId), schema2);

Review comment:
       Nit: I think we could make this error message clearer, because if this assertion fails, the given schemaId is definitely missing from map2.
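
       A minimal sketch of what a more specific message could look like. It uses plain `AssertionError` instead of JUnit so that it stays self-contained, and a generic value type stands in for `Schema`. The class name and the message wording are illustrative assumptions, not the change that was eventually made in the PR:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; V stands in for the Schema type.
class SchemaMapAssertions {

  static <V> void assertSameSchemaMap(Map<Integer, V> map1, Map<Integer, V> map2) {
    if (map1.size() != map2.size()) {
      throw new AssertionError(String.format(
          "Should have same number of schemas in both maps: %d vs %d", map1.size(), map2.size()));
    }

    map1.forEach((schemaId, schema1) -> {
      V schema2 = map2.get(schemaId);
      if (schema2 == null) {
        // The id came from map1, so a failure here can only mean map2 lacks it.
        throw new AssertionError(
            String.format("Schema ID %s is missing from the second map", schemaId));
      }
    });
  }

  public static void main(String[] args) {
    Map<Integer, String> first = new HashMap<>();
    first.put(1, "schema-a");
    Map<Integer, String> second = new HashMap<>();
    second.put(2, "schema-a");

    try {
      assertSameSchemaMap(first, second);
    } catch (AssertionError e) {
      System.out.println(e.getMessage());
    }
  }
}
```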

##########
File path: core/src/main/java/org/apache/iceberg/SerializableTable.java
##########
@@ -147,6 +147,7 @@ public String location() {
     return properties;
   }
 
+  // Note that schema parsed from string does not contain the correct schema ID.

Review comment:
       What does this mean?

##########
File path: core/src/main/java/org/apache/iceberg/BaseSnapshot.java
##########
@@ -87,7 +91,17 @@
                String operation,
                Map<String, String> summary,
                List<ManifestFile> dataManifests) {
-    this(io, INITIAL_SEQUENCE_NUMBER, snapshotId, parentId, timestampMillis, operation, summary, null);
+    this(io, snapshotId, parentId, timestampMillis, operation, summary, null, dataManifests);
+  }
+
+  BaseSnapshot(FileIO io,
+               long snapshotId,
+               Long parentId,
+               long timestampMillis,
+               String operation,
+               Map<String, String> summary,
+               Integer schemaId, List<ManifestFile> dataManifests) {

Review comment:
       Nit: Let's put these two constructor parameters on separate lines; that's clearer.

##########
File path: core/src/main/java/org/apache/iceberg/BaseSnapshot.java
##########
@@ -78,6 +81,7 @@
     this.operation = operation;
     this.summary = summary;
     this.manifestListLocation = manifestList;
+    this.schemaId = schemaId;

Review comment:
       Nit: Please use the same order for the constructor parameters and the member assignments, which makes the code more readable.
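
       A small sketch of the layout these two nits ask for: one parameter per line, with assignments in the same order as the parameters. The class `SnapshotFieldsSketch` and its reduced field set are hypothetical, and the parameter order shown is an assumption rather than the actual `BaseSnapshot` signature:

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical class for illustration; the field set is an assumption, not BaseSnapshot itself.
class SnapshotFieldsSketch {
  private final String operation;
  private final Map<String, String> summary;
  private final Integer schemaId;
  private final String manifestListLocation;

  SnapshotFieldsSketch(String operation,
                       Map<String, String> summary,
                       Integer schemaId,
                       String manifestList) {
    // Assignments follow the parameter order above, one field per parameter.
    this.operation = operation;
    this.summary = summary;
    this.schemaId = schemaId;
    this.manifestListLocation = manifestList;
  }

  String operation() {
    return operation;
  }

  Map<String, String> summary() {
    return summary;
  }

  Integer schemaId() {
    return schemaId;
  }

  String manifestListLocation() {
    return manifestListLocation;
  }

  public static void main(String[] args) {
    SnapshotFieldsSketch sketch = new SnapshotFieldsSketch(
        "append", Collections.emptyMap(), 1, "/tmp/manifest-list.avro");
    System.out.println(sketch.operation() + " uses schema id " + sketch.schemaId());
  }
}
```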




