[GitHub] [iceberg] pvary commented on a change in pull request #3912: Hive: Support 'identifier-field-ids' when creating table in hive

GitBox Fri, 21 Jan 2022 21:44:57 -0800


pvary commented on a change in pull request #3912:
URL: https://github.com/apache/iceberg/pull/3912#discussion_r790107120




##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveSchemaUtil.java
##########
@@ -75,7 +84,36 @@ public static Schema convert(List<FieldSchema> fieldSchemas, boolean autoConvert
       typeInfos.add(TypeInfoUtils.getTypeInfoFromTypeString(col.getType()));
       comments.add(col.getComment());
     }
-    return HiveSchemaConverter.convert(names, typeInfos, comments, autoConvert);
+    Schema schema = HiveSchemaConverter.convert(names, typeInfos, comments, autoConvert);
+    return rebuildSchemaWithIdentifierFieldIds(schema, identifierFieldNames);
+  }
+
+  /**
+   * Rebuild a schema with given schema and identifierFieldNames
+   * @param schema The origin schema.
+   * @param identifierFieldNames The identifierFieldNames.
+   * @return New schema with IdentifierFieldIds.
+   */
+  public static Schema rebuildSchemaWithIdentifierFieldIds(Schema schema, Set<String> identifierFieldNames) {
+    Map<Integer, Integer> indexParents = TypeUtil.indexParents(schema.asStruct());
+    Set<Integer> identifierFieldIds = identifierFieldNames.stream()
+        .map(name -> {
+          Types.NestedField field = schema.findField(name);

Review comment:
       I think it would be better to check only the top-level fields instead of 
doing a recursive lookup and then throwing an exception when the match turns 
out to be a nested field. Also, a nested field might have the same name as a 
column, so a recursive lookup could resolve to the wrong field.
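
To illustrate the suggestion, here is a minimal sketch of a top-level-only lookup. It deliberately does not use the Iceberg classes: `Field`, `TopLevelIdentifierFields`, and `identifierFieldIds` are hypothetical stand-ins made up for this example, and the real change would have to be adapted to `Schema` and `Types.NestedField`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TopLevelIdentifierFields {

  // Hypothetical stand-in for a schema field; this is not Iceberg's Types.NestedField.
  static final class Field {
    final int id;
    final String name;
    final List<Field> children;

    Field(int id, String name, Field... children) {
      this.id = id;
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Resolve identifier field names against the top-level columns only.
  // Children are never searched, so a name that exists only below the first
  // level is rejected the same way as a name that does not exist at all.
  static Set<Integer> identifierFieldIds(List<Field> columns, Set<String> names) {
    Set<Integer> ids = new LinkedHashSet<>();
    for (String name : names) {
      Field match = null;
      for (Field column : columns) {
        if (column.name.equals(name)) {
          match = column;
          break;
        }
      }
      if (match == null) {
        throw new IllegalArgumentException(
            "Identifier field is not a top-level column: " + name);
      }
      ids.add(match.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    // Column "id" (field id 1) plus a struct "payload" holding a nested "id" (field id 3).
    List<Field> columns = Arrays.asList(
        new Field(1, "id"),
        new Field(2, "payload", new Field(3, "id"), new Field(4, "ts")));

    // Resolves to the top-level column, never to the nested field of the same name.
    System.out.println(identifierFieldIds(columns, Collections.singleton("id")));

    // "ts" exists only inside "payload", so it is rejected.
    try {
      identifierFieldIds(columns, Collections.singleton("ts"));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because only the first level is scanned, the name clash between a column and a nested field cannot occur, and the nested-field case needs no separate parent check.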




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


