[GitHub] [iceberg] pvary commented on a change in pull request #1612: Hive: Using Hive schema to create tables and partition specification

GitBox Thu, 19 Nov 2020 01:05:23 -0800


pvary commented on a change in pull request #1612:
URL: https://github.com/apache/iceberg/pull/1612#discussion_r526697929




##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java
##########
@@ -56,10 +62,22 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
     } else if (serDeProperties.get(InputFormatConfig.TABLE_SCHEMA) != null) {
       tableSchema = SchemaParser.fromJson((String) serDeProperties.get(InputFormatConfig.TABLE_SCHEMA));
     } else {
-      try {
-        tableSchema = Catalogs.loadTable(configuration, serDeProperties).schema();
-      } catch (NoSuchTableException nte) {
-        throw new SerDeException("Please provide an existing table or a valid schema", nte);
+      // Read the configuration parameters
+      String columnNames = serDeProperties.getProperty(serdeConstants.LIST_COLUMNS);
+      String columnTypes = serDeProperties.getProperty(serdeConstants.LIST_COLUMN_TYPES);

Review comment:
       Hive does a lot of magic here:
   - First it creates a SerDe to get the columns; in this case these properties are empty.
   - It uses the objectInspectors from the SerDe to update the column specification of the table.
   - Then, during query planning, it calls this part again, but by then the properties are already set based on the result of the previous call.
   
   That said, this theoretically solves the schema evolution problem, but I am not convinced that we have caught all of the nuances. Further down the road we plan to test this as well and file new PRs as required.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

