openinx commented on a change in pull request #2275:
URL: https://github.com/apache/iceberg/pull/2275#discussion_r608315319
##########
File path: api/src/main/java/org/apache/iceberg/Snapshot.java
##########
@@ -126,4 +126,13 @@
* @return the location of the manifest list for this Snapshot
*/
String manifestListLocation();
+
+ /**
+ * Return the id of the schema used when this snapshot was created, or null if this information is not available.
+ *
+ * @return schema id associated with this snapshot
+ */
+ default Integer schemaId() {
+ return null;
Review comment:
In which cases will this information be `null`? And if it is `null`,
how can people read the correct schema for the snapshot?
##########
File path: api/src/main/java/org/apache/iceberg/Snapshot.java
##########
@@ -126,4 +126,13 @@
* @return the location of the manifest list for this Snapshot
*/
String manifestListLocation();
+
+ /**
+ * Return the id of the schema used when this snapshot was created, or null if this information is not available.
+ *
+ * @return schema id associated with this snapshot
+ */
+ default Integer schemaId() {
+ return null;
Review comment:
Okay, I think you mean that when people read old metadata, the schema id
of its snapshots will be `null`.
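One way a reader could handle a `null` schema id is to fall back to the table's current schema. The sketch below only illustrates that idea: `SnapshotSchemaLookup`, `schemaFor`, and the use of `String` as a stand-in for a schema are hypothetical and are not part of this pull request or the Iceberg API.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not the real Iceberg API.
final class SnapshotSchemaLookup {
  private SnapshotSchemaLookup() {
  }

  // schemaId: the id recorded on the snapshot, or null when the snapshot
  //           comes from old metadata that did not track schema ids.
  // schemasById: all known schemas keyed by id (a String stands in for a schema here).
  // currentSchemaId: id of the table's current schema, used as the fallback (an assumption).
  static String schemaFor(Integer schemaId, Map<Integer, String> schemasById, int currentSchemaId) {
    if (schemaId == null) {
      // No schema id was recorded, so fall back to the current schema.
      return schemasById.get(currentSchemaId);
    }

    String schema = schemasById.get(schemaId);
    if (schema == null) {
      throw new IllegalArgumentException("Unknown schema ID: " + schemaId);
    }
    return schema;
  }
}
```

With this fallback, a snapshot from old metadata is read with the current schema, which may differ from the schema that was in effect when the snapshot was written.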
##########
File path: api/src/test/java/org/apache/iceberg/TestHelpers.java
##########
@@ -120,6 +120,22 @@ public static void assertSerializedAndLoadedMetadata(Table expected, Table actua
Assert.assertEquals("History must match", expected.history(), actual.history());
}
+ public static void assertSameSchemaMap(Map<Integer, Schema> map1, Map<Integer, Schema> map2) {
+ if (map1.size() != map2.size()) {
+ Assert.fail("Should have same number of schemas in both maps");
+ }
+
+ map1.forEach((schemaId, schema1) -> {
+ Schema schema2 = map2.get(schemaId);
+ Assert.assertNotNull(String.format("Schema ID %s should exist in both map", schemaId), schema2);
Review comment:
Nit: I think we could make this error message clearer, because when
this assertion fails, the given schemaId is definitely the one missing
from map2.
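A possible shape for the clearer message, written as a self-contained sketch. It throws `AssertionError` directly instead of using JUnit's `Assert`, uses a generic type in place of `Schema`, and the class name `SchemaMapAssertions` is hypothetical. The final equality check is also an assumption, since the diff above is cut off before any comparison of the schemas.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the test helper; S replaces Iceberg's Schema type.
final class SchemaMapAssertions {
  private SchemaMapAssertions() {
  }

  static <S> void assertSameSchemaMap(Map<Integer, S> map1, Map<Integer, S> map2) {
    if (map1.size() != map2.size()) {
      throw new AssertionError(String.format(
          "Should have same number of schemas in both maps: %d vs %d", map1.size(), map2.size()));
    }

    map1.forEach((schemaId, schema1) -> {
      S schema2 = map2.get(schemaId);
      if (schema2 == null) {
        // Name the map that lacks the id, since that is the only way this check can fail.
        throw new AssertionError(String.format("Schema ID %s is missing from the second map", schemaId));
      }
      // Assumed comparison; the original helper's comparison is not shown in the diff.
      if (!Objects.equals(schema1, schema2)) {
        throw new AssertionError(String.format("Schemas differ for schema ID %s", schemaId));
      }
    });
  }
}
```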
##########
File path: core/src/main/java/org/apache/iceberg/SerializableTable.java
##########
@@ -147,6 +147,7 @@ public String location() {
return properties;
}
+ // Note that schema parsed from string does not contain the correct schema ID.
Review comment:
What does this mean?
##########
File path: core/src/main/java/org/apache/iceberg/BaseSnapshot.java
##########
@@ -87,7 +91,17 @@
String operation,
Map<String, String> summary,
List<ManifestFile> dataManifests) {
- this(io, INITIAL_SEQUENCE_NUMBER, snapshotId, parentId, timestampMillis, operation, summary, null);
+ this(io, snapshotId, parentId, timestampMillis, operation, summary, null, dataManifests);
+ }
+
+ BaseSnapshot(FileIO io,
+ long snapshotId,
+ Long parentId,
+ long timestampMillis,
+ String operation,
+ Map<String, String> summary,
+ Integer schemaId, List<ManifestFile> dataManifests) {
Review comment:
Nit: Let's put these two constructor parameters on separate lines;
that's clearer.
##########
File path: core/src/main/java/org/apache/iceberg/BaseSnapshot.java
##########
@@ -78,6 +81,7 @@
this.operation = operation;
this.summary = summary;
this.manifestListLocation = manifestList;
+ this.schemaId = schemaId;
Review comment:
Nit: Please keep the member assignments in the same order as the
constructor parameters, which makes the code more readable.
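To illustrate the two formatting nits, here is a small sketch with one parameter per line and assignments in parameter order. `SnapshotRecord` and its reduced field list are hypothetical and do not mirror the real `BaseSnapshot` constructor.

```java
import java.util.Map;

// Hypothetical class for illustration; not the real BaseSnapshot.
final class SnapshotRecord {
  private final long snapshotId;
  private final Long parentId;
  private final String operation;
  private final Map<String, String> summary;
  private final Integer schemaId;
  private final String manifestListLocation;

  // One parameter per line, so a newly added parameter is easy to spot in a diff.
  SnapshotRecord(long snapshotId,
                 Long parentId,
                 String operation,
                 Map<String, String> summary,
                 Integer schemaId,
                 String manifestListLocation) {
    // Assignments follow the parameter order above.
    this.snapshotId = snapshotId;
    this.parentId = parentId;
    this.operation = operation;
    this.summary = summary;
    this.schemaId = schemaId;
    this.manifestListLocation = manifestListLocation;
  }

  Integer schemaId() {
    return schemaId;
  }

  String manifestListLocation() {
    return manifestListLocation;
  }
}
```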