kbuci commented on issue #18711:
URL: https://github.com/apache/hudi/issues/18711#issuecomment-4411765009

   Yes, I'd prefer that as well: passing around the Hoodie schema as the 
"authoritative source" to infer HUDI logical types, since that's cleaner and 
easier to understand, and it also seems to be the precedent in HUDI Spark. My 
only concern was figuring out whether that breaks any public APIs, but I can 
assess that as I create the PR, and if it does, we can make sure to land any 
such PR in 1.3+ (and not in 1.2). 
   
   I'm still familiarizing myself with other table formats, but based on a very 
rough initial search, it seems other table formats might also be leaning 
towards approach A in practice?
   
   ## Iceberg
   
   - **Iceberg Spec (v3, includes Variant type definition)**: 
https://iceberg.apache.org/spec
   - **Iceberg Schemas doc (field IDs, type system)**: 
https://iceberg.apache.org/docs/latest/schemas
   - **PR: Add variant type support to ParquetTypeVisitor**: 
https://github.com/apache/iceberg/pull/14588
   - **PR: Implement Variant Parquet readers**: 
https://github.com/apache/iceberg/pull/12139
   - **PR: Spec — add variant type**: 
https://github.com/apache/iceberg/pull/10831
   - **Snowflake blog: Iceberg v3 Variant Type**: 
https://www.snowflake.com/en/engineering-blog/apache-iceberg-v3-variant-type/
   
   ## Delta Lake
   
   - **Delta Variant Type RFC**: 
https://github.com/delta-io/delta/blob/master/protocol_rfcs/accepted/variant-type.md
   - **Delta Protocol (v4.2.0, schema in transaction log)**: 
https://github.com/delta-io/delta/blob/v4.2.0/PROTOCOL.md
   - **PR: Add VariantType support in Spark schema conversion**: 
https://github.com/delta-io/delta/pull/6164
   - **PR: Kernel-level variant schema deserialization**: 
https://github.com/delta-io/delta/pull/3464
   - **Delta Variant Shredding RFC**: 
https://github.com/delta-io/delta/blob/master/protocol_rfcs/variant-shredding.md
   
   ## Paimon
   
   - **PIP-40: Introduce a new Vector data type**: 
https://cwiki.apache.org/confluence/display/PAIMON/PIP-40%3A+Introduce+a+new+Vector+data+type
   - **Issue: Introduce VecType**: https://github.com/apache/paimon/issues/7011
   - **PR: Add Flink support for VectorType**: 
https://github.com/apache/paimon/pull/7238
   - **Paimon FileFormat spec (Parquet type mapping)**: 
https://paimon.apache.org/docs/1.4/concepts/spec/fileformat/
   - **Paimon ParquetSchemaConverter API**: 
https://paimon.apache.org/docs/0.9/api/java/org/apache/paimon/format/parquet/ParquetSchemaConverter.html
   


