[GitHub] [iceberg] pvary commented on a change in pull request #1612: Hive: Using Hive schema to create tables and partition specification

GitBox Wed, 25 Nov 2020 00:35:58 -0800


pvary commented on a change in pull request #1612:
URL: https://github.com/apache/iceberg/pull/1612#discussion_r530190583




##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java
##########
@@ -56,10 +61,28 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
     } else if (serDeProperties.get(InputFormatConfig.TABLE_SCHEMA) != null) {
       tableSchema = SchemaParser.fromJson((String) serDeProperties.get(InputFormatConfig.TABLE_SCHEMA));
     } else {
-      try {
-        tableSchema = Catalogs.loadTable(configuration, serDeProperties).schema();
-      } catch (NoSuchTableException nte) {
-        throw new SerDeException("Please provide an existing table or a valid schema", nte);
+      if (Catalogs.hiveCatalog(configuration)) {

Review comment:
HiveTableOperations converts the table schema to Hive columns / StorageDescriptor whenever a change is committed to the table. This means that the Iceberg schema and the Hive schema are always synchronized.
Given this synchronization, I think it is better to use the "cached" schema instead of loading the table again and again. This might change when we clean up Timestamps / UUIDs, since the mapping is not one-to-one there, but I would leave something for that new PR too 😄
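
To make the trade-off concrete, here is a minimal, self-contained sketch of the idea, not the code from this PR. It assumes the SerDe receives the Hive column names and types through the `columns` (comma-separated) and `columns.types` (colon-separated) properties, and it only handles simple primitive types. The class `SchemaResolutionSketch`, the `resolveSchema` method, and the `tableLoader` callback are hypothetical names used for illustration; the schema is modelled as a plain name-to-type map instead of an Iceberg `Schema`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

public class SchemaResolutionSketch {

  // Hypothetical helper: prefer the schema Hive already holds ("cached")
  // and only fall back to loading the table when it is not available.
  static Map<String, String> resolveSchema(
      Properties serDeProperties,
      boolean hiveCatalog,
      Supplier<Map<String, String>> tableLoader) {
    String names = serDeProperties.getProperty("columns");
    String types = serDeProperties.getProperty("columns.types");
    if (hiveCatalog && names != null && types != null) {
      // With a Hive catalog the Hive columns are kept in sync on every commit,
      // so the table does not need to be loaded again.
      return fromHiveColumns(names, types);
    }
    // Other catalogs: the expensive path, stands in for loading the table.
    return tableLoader.get();
  }

  // Simplified parsing: assumes primitive types only, so splitting on ':' is safe.
  // Complex types such as struct<a:int> would need a real type parser.
  static Map<String, String> fromHiveColumns(String names, String types) {
    String[] columnNames = names.split(",");
    String[] columnTypes = types.split(":");
    if (columnNames.length != columnTypes.length) {
      throw new IllegalArgumentException("Column names and types do not match");
    }
    Map<String, String> schema = new LinkedHashMap<>();
    for (int i = 0; i < columnNames.length; i++) {
      schema.put(columnNames[i].trim(), columnTypes[i].trim());
    }
    return schema;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("columns", "id,name");
    props.setProperty("columns.types", "bigint:string");
    // The loader is never invoked here because the Hive columns are present.
    Map<String, String> schema = resolveSchema(props, true, LinkedHashMap::new);
    System.out.println(schema);
  }
}
```

The point of the fallback structure is that the loader is only invoked when the Hive-side columns cannot be trusted, which is the case the comment expects to revisit for Timestamps / UUIDs.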




