rdblue commented on a change in pull request #2465:
URL: https://github.com/apache/iceberg/pull/2465#discussion_r620692694
##########
File path: api/src/main/java/org/apache/iceberg/Schema.java
##########
@@ -158,6 +186,33 @@ public StructType asStruct() {
return struct.fields();
}
+ /**
+ * The set of identifier field IDs.
+ * <p>
+ * Identifier is a concept similar to primary key in a relational database system.
+ * A row should be unique in a table based on the values of the identifier fields.
+ * However, unlike a primary key, Iceberg identifier differs in the following ways:
+ * <ul>
+ * <li>Iceberg does not enforce the uniqueness of a row based on this identifier information.
+ * It is used for operations like upsert to define the default upsert key.</li>
+ * <li>NULL can be used as value of an identifier field. Iceberg ensures null-safe equality check.</li>
+ * <li>A nested field in a struct can be used as an identifier. For example, if there is a "last_name" field
+ * inside a "user" struct in a schema, field "user.last_name" can be set as a part of the identifier field.</li>
+ * </ul>
+ * <p>
+ * A field can be used as a part of the identifier only if:
+ * <ul>
+ * <li>its type is primitive</li>
+ * <li>it exists in the current schema, or have been added in this update</li>
Review comment:
This point doesn't make much sense here because there is no "this
update" and the "current schema" is this schema. You could say that these field
IDs are guaranteed to be non-repeated primitive fields in this schema.
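
A rough sketch of what that guarantee could mean in practice. This uses a simplified, hypothetical field model (`Field`, `Kind`, `exampleSchema`, and `validate` are illustrative names, not Iceberg's `Types.NestedField` or any real `Schema` API) and accepts an identifier field ID only when it belongs to a primitive field of this schema that is not nested under a list or map:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IdentifierFieldCheck {
  // Hypothetical, simplified type kinds; assumed for illustration only.
  enum Kind { PRIMITIVE, STRUCT, LIST, MAP }

  // Hypothetical stand-in for a schema field with nested children.
  static final class Field {
    final int id;
    final String name;
    final Kind kind;
    final List<Field> children;

    Field(int id, String name, Kind kind, Field... children) {
      this.id = id;
      this.name = name;
      this.kind = kind;
      this.children = List.of(children);
    }
  }

  // Records, for every field ID in the schema, whether the field is a
  // primitive that is not nested under a list or map (non-repeated).
  static void collectEligible(List<Field> fields, boolean repeated, Map<Integer, Boolean> out) {
    for (Field f : fields) {
      out.put(f.id, f.kind == Kind.PRIMITIVE && !repeated);
      boolean childRepeated = repeated || f.kind == Kind.LIST || f.kind == Kind.MAP;
      collectEligible(f.children, childRepeated, out);
    }
  }

  // Throws if any identifier field ID is missing from the schema or does not
  // refer to a non-repeated primitive field.
  static void validate(List<Field> schema, Set<Integer> identifierFieldIds) {
    Map<Integer, Boolean> eligible = new HashMap<>();
    collectEligible(schema, false, eligible);
    for (int id : identifierFieldIds) {
      Boolean ok = eligible.get(id);
      if (ok == null) {
        throw new IllegalArgumentException("Field ID not in this schema: " + id);
      }
      if (!ok) {
        throw new IllegalArgumentException("Not a non-repeated primitive field: " + id);
      }
    }
  }

  // Example schema: id, user.last_name (struct-nested), tags (list of strings).
  static List<Field> exampleSchema() {
    return List.of(
        new Field(1, "id", Kind.PRIMITIVE),
        new Field(2, "user", Kind.STRUCT, new Field(3, "last_name", Kind.PRIMITIVE)),
        new Field(4, "tags", Kind.LIST, new Field(5, "element", Kind.PRIMITIVE)));
  }

  public static void main(String[] args) {
    List<Field> schema = exampleSchema();
    validate(schema, Set.of(1, 3)); // top-level and struct-nested primitives pass
    try {
      validate(schema, Set.of(5)); // list element is repeated, so it is rejected
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With a check like this done when the schema is built, the Javadoc on the accessor can state the guarantee directly instead of listing preconditions that only apply during an update.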