funcheetah opened a new pull request, #4654:
URL: https://github.com/apache/iceberg/pull/4654

   ### Summary
   Neither Apache Iceberg nor Apache Spark supports non-optional union types 
(e.g. [“int”, “string”]). This PR enables Iceberg to read non-optional union 
types stored in the Apache ORC format by converting them into struct 
representations.
   
   ### Representation
   The struct representations converted from non-optional union types are 
consistent with the non-optional union support added to Trino in 
https://github.com/trinodb/trino/pull/3483. 
    
   Deeply nested non-optional union types are supported.
   
   #### Examples
   
   Basic
   
   [“int”, “string”] -> struct<tag int, field0 int, field1 string>
   
   Single type
   
   [“int”] -> int
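   
   The mapping above can be sketched in a few lines of Python. This is only an 
illustration of the representation, not the code in this PR: the function names 
`union_to_struct` and `union_value_to_row` are hypothetical, unions are modeled 
as plain Python lists, and the rule that only the field selected by `tag` is 
populated is an assumption based on the Trino behavior referenced above.

```python
from typing import Any, Dict, List, Union

# A union is modeled as a list of options; each option is either a type name
# or another list (a nested union). This modeling is an assumption made for
# illustration and is not Iceberg's or ORC's type API.
UnionOption = Union[str, List["UnionOption"]]


def union_to_struct(options: List[UnionOption]) -> str:
    """Render a non-optional union as the struct form shown in the examples."""
    # Convert nested unions first so they appear in their struct form.
    rendered = [
        union_to_struct(opt) if isinstance(opt, list) else opt
        for opt in options
    ]
    # A single-type union collapses to the type itself.
    if len(rendered) == 1:
        return rendered[0]
    fields = ["tag int"] + [f"field{i} {t}" for i, t in enumerate(rendered)]
    return "struct<" + ", ".join(fields) + ">"


def union_value_to_row(tag: int, value: Any, num_options: int) -> Dict[str, Any]:
    """Build a struct row for one union value (assumed semantics).

    Assumption: `tag` is the index of the active option, the matching
    `field<tag>` holds the value, and every other field is null (None).
    """
    if not 0 <= tag < num_options:
        raise ValueError(f"tag {tag} out of range for {num_options} options")
    row: Dict[str, Any] = {"tag": tag}
    for i in range(num_options):
        row[f"field{i}"] = value if i == tag else None
    return row
```

   For example, `union_to_struct(["int", "string"])` yields 
`struct<tag int, field0 int, field1 string>`, and a string value in that union 
would be read as `{"tag": 1, "field0": None, "field1": "abc"}` under the 
assumption stated above.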
   
   ### Related PRs
   - https://github.com/apache/iceberg/pull/4242
   
   ### TODO
   
   - Add a spec for non-optional union type support
   
   - Handle single-type unions (e.g. [“int”]) as primitive types
   
   - Support non-Spark environments (e.g. iceberg-data, Flink, Hive, etc.)
   
   - Support schema pruning within a complex union


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

