rdblue commented on a change in pull request #227: ORC column map fix
URL: https://github.com/apache/incubator-iceberg/pull/227#discussion_r300061844
 
 

 ##########
 File path: orc/src/main/java/org/apache/iceberg/orc/TypeConversion.java
 ##########
 @@ -159,21 +163,45 @@ Type convertOrcToType(TypeDescription schema, ColumnIdMap columnIds) {
         for (int c = 0; c < fieldNames.size(); ++c) {
           String name = fieldNames.get(c);
           TypeDescription type = fieldTypes.get(c);
-          fields.add(Types.NestedField.optional(columnIds.get(type), name,
-              convertOrcToType(type, columnIds)));
+          IcebergColumn col = columnIds.get(type);
+          final Types.NestedField field;
+          if (col != null) {
+            field = col.isRequired() ?
+                Types.NestedField.required(col.id(), name, convertOrcToType(type, columnIds)) :
+                Types.NestedField.optional(col.id(), name, convertOrcToType(type, columnIds));
+          } else {
+            field = Types.NestedField.optional(type.getId(), name, convertOrcToType(type, columnIds));
 
 Review comment:
   For Parquet, the fallback logic is separate, which I think is a good idea.
   
   The problem with falling back when a single type is missing is that you could have ID collisions: for example, an ID taken from `type.getId()` could duplicate one that the mapping already assigned to another column. Iceberg should either reassign all column IDs or use the mappings, not both. For Parquet, we have a `hasIDs` method that checks whether all columns have an ID, and the mapped IDs are used only if that's the case. If not, there is a fallback that assigns IDs by top-level position starting from 1 (instead of 0, as this would do), because table ID assignment starts at 1.
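
   A rough sketch of the all-or-nothing rule described above, in case it helps. This is not the Parquet code: `Column`, `hasIds`, and `assignIds` are hypothetical names made up for illustration, and the sketch only handles flat top-level columns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnIdFallbackSketch {

  // Hypothetical stand-in for a file column: a name plus an optional mapped ID.
  static final class Column {
    final String name;
    final Integer mappedId; // null when the file carries no ID for this column

    Column(String name, Integer mappedId) {
      this.name = name;
      this.mappedId = mappedId;
    }
  }

  // True only when every column has a mapped ID.
  static boolean hasIds(List<Column> columns) {
    for (Column column : columns) {
      if (column.mappedId == null) {
        return false;
      }
    }
    return true;
  }

  // Use the mapped IDs only when all of them are present. Otherwise ignore
  // every mapped ID and reassign by top-level position, starting from 1.
  static Map<String, Integer> assignIds(List<Column> columns) {
    Map<String, Integer> ids = new LinkedHashMap<>();
    if (hasIds(columns)) {
      for (Column column : columns) {
        ids.put(column.name, column.mappedId);
      }
    } else {
      int nextId = 1; // assumption from the comment: ID assignment starts at 1
      for (Column column : columns) {
        ids.put(column.name, nextId++);
      }
    }
    return ids;
  }

  public static void main(String[] args) {
    List<Column> fullyMapped = Arrays.asList(
        new Column("a", 10), new Column("b", 20));
    List<Column> partiallyMapped = Arrays.asList(
        new Column("a", 2), new Column("b", null), new Column("c", 3));

    System.out.println(assignIds(fullyMapped));     // prints {a=10, b=20}
    System.out.println(assignIds(partiallyMapped)); // prints {a=1, b=2, c=3}
  }
}
```

   In the partially mapped case, the mapped IDs 2 and 3 are discarded rather than mixed with a positional ID for `b`, so the result cannot contain duplicates.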

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
