Hi Gang,

Thanks for bringing this up.
I think that if the Variant type had come up earlier (before Iceberg/Arrow), its natural place would have been at the file format level, like any other type. The communities started discussing where it should live because we now have different type systems in different places. Also, the current Variant spec is more or less independent of the Parquet file format. However, even at the Parquet level, we would need at least one additional logical type to help systems that read and write Parquet handle the Variant type.

To summarize my opinion: +1 for having the whole Variant spec in the Parquet format.

Cheers,
Gabor

Gang Wu <ust...@gmail.com> wrote (on Fri, Aug 23, 2024, 11:18):

> Hi,
>
> Apache Iceberg is adding variant type support [1][2] by adopting the variant
> spec [3] from Apache Spark. As the proposal is getting mature, both Iceberg [4]
> and Spark [5] communities are discussing moving the variant type to Parquet
> repo to avoid divergence. Moving it into Parquet makes the variant spec engine
> and table format agnostic, which may encourage wider adoption.
>
> What do people from Parquet community think?
>
> [1] https://lists.apache.org/thread/xnyo1k66dxh0ffpg7j9f04xgos0kwc34
> [2] https://lists.apache.org/thread/xcyytoypgplfr74klg1z2rgjo6k5b0sq
> [3] https://github.com/apache/spark/blob/d84f1a3575c4125009374521d2f179089ebd71ad/common/variant/README.md
> [4] https://lists.apache.org/thread/hopkr2f0ftoywwt9zo3jxb7n0ob5s5bw
> [5] https://lists.apache.org/thread/0k5oj3mn0049fcxoxm3gx3d7r28gw4rj
>
> Best,
> Gang