[GitHub] [iceberg] rdblue commented on a change in pull request #2465: Core: add row identifier to schema

GitBox Sat, 17 Apr 2021 16:48:11 -0700


rdblue commented on a change in pull request #2465:
URL: https://github.com/apache/iceberg/pull/2465#discussion_r615317981




##########
File path: core/src/main/java/org/apache/iceberg/SchemaUpdate.java
##########
@@ -51,6 +53,7 @@
   private final TableMetadata base;
   private final Schema schema;
   private final Map<Integer, Integer> idToParent;
+  private final Set<String> identifierFieldNames;

Review comment:
       Why did you choose to use names instead of ids here? I think IDs would 
be a little simpler because you wouldn't need to translate in more cases. For 
example, rename wouldn't need to modify the identifier field ID set because it 
doesn't change the field IDs. You'd also only need to copy the set of field IDs 
returned by the current schema in the constructor.
   
   I was looking for a reason why you might have chosen to use names, but I 
think that it is straightforward to use IDs instead of names in all cases. For 
example, the check in `deleteColumn` validates the name passed in directly, 
which seems easier; but the name has already been looked up in the schema so we 
have a field and could validate `identifierFieldIds.contains(field.fieldId())` 
instead like the other checks do.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] rdblue commented on a change in pull request #2465: Core: add row identifier to schema

Reply via email to