[GitHub] [iceberg] rdblue commented on a change in pull request #2465: Core: add row identifier to schema

GitBox Sat, 17 Apr 2021 16:41:23 -0700


rdblue commented on a change in pull request #2465:
URL: https://github.com/apache/iceberg/pull/2465#discussion_r615317429




##########
File path: core/src/main/java/org/apache/iceberg/SchemaUpdate.java
##########
@@ -317,6 +333,38 @@ public UpdateSchema unionByNameWith(Schema newSchema) {
     return this;
   }
 
+  @Override
+  public UpdateSchema addIdentifierField(String name) {
+    Types.NestedField field = schema.findField(name);
+    if (field == null) {
+      field = adds.get(TABLE_ROOT_ID).stream()
+          .filter(f -> f.name().equals(name))
+          .findAny().orElse(null);

Review comment:
       This lookup only searches fields added under `TABLE_ROOT_ID`, so it assumes that the field is not a newly added field within a struct.
   
   I think that this should use the same strategy that we use for `addColumn`. 
There are two variations of `addColumn`, one that accepts a parent name and a 
field name, and one that accepts only a field name. If the latter is called 
with a name that contains `.`, then an exception is thrown because the 
reference is ambiguous. For example, `a.b` could be a top-level field named 
`a.b` or could be field `b` nested within field `a`.
   
   By structuring the methods that way, we always know the parent field and can 
use that instead of assuming `TABLE_ROOT_ID` here.
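   
   A minimal sketch of that structure, to make the suggestion concrete. 
Everything here is hypothetical: `IdentifierFieldSketch`, its `adds` map keyed 
by parent name, the `ROOT` sentinel, and the two-argument 
`addIdentifierField(parent, name)` are stand-ins for illustration, not the 
actual `SchemaUpdate` code, and lookups against the existing schema are left 
out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the pending-update state; this is not Iceberg's SchemaUpdate.
class IdentifierFieldSketch {
  // Assumed sentinel for "no parent", playing the role of TABLE_ROOT_ID in the diff.
  private static final String ROOT = "";

  // Newly added fields: parent name -> (field name -> assigned field ID).
  private final Map<String, Map<String, Integer>> adds = new HashMap<>();
  private final Set<Integer> identifierFieldIds = new LinkedHashSet<>();
  private int nextId = 1;

  private static String key(String parent) {
    return parent == null ? ROOT : parent;
  }

  // Registers a new field under the given parent (null means top level).
  IdentifierFieldSketch addColumn(String parent, String name) {
    adds.computeIfAbsent(key(parent), k -> new HashMap<>()).put(name, nextId++);
    return this;
  }

  // Name-only variant: rejects dotted names because "a.b" could be a top-level
  // field named "a.b" or a field "b" nested within "a".
  IdentifierFieldSketch addIdentifierField(String name) {
    if (name.contains(".")) {
      throw new IllegalArgumentException(
          "Cannot add identifier field with ambiguous name: " + name
              + ", use addIdentifierField(parent, name)");
    }
    return addIdentifierField(null, name);
  }

  // Parent-aware variant: the parent is always known, so the lookup never has
  // to assume the table root.
  IdentifierFieldSketch addIdentifierField(String parent, String name) {
    Integer id = adds.getOrDefault(key(parent), Collections.emptyMap()).get(name);
    if (id == null) {
      throw new IllegalArgumentException(
          "Cannot find field " + name + " under parent " + parent);
    }
    identifierFieldIds.add(id);
    return this;
  }

  Set<Integer> identifierFieldIds() {
    return identifierFieldIds;
  }

  public static void main(String[] args) {
    IdentifierFieldSketch update = new IdentifierFieldSketch()
        .addColumn(null, "a")
        .addColumn("a", "b")
        .addIdentifierField("a", "b");
    System.out.println(update.identifierFieldIds());
  }
}
```

   Recording the resolved field ID rather than a dotted string keeps the 
result unambiguous even when a top-level field named `a.b` exists alongside a 
nested `a` / `b`.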




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]