Hi Gang,

Thanks for bringing this up.

I think that if the Variant type had come up earlier (before
Iceberg/Arrow), its natural place would have been at the file format level,
like any other type. The communities started discussing where it should be
placed because we now have different type systems in different places.
Also, the current Variant spec makes it more or less independent of
the Parquet file format.
However, even at the Parquet level, we would need at least one additional
logical type to help systems that read/write Parquet handle the Variant
type.

To summarize my opinion, +1 for having the whole Variant spec in Parquet
format.

Cheers,
Gabor

Gang Wu <ust...@gmail.com> wrote (on Fri, Aug 23, 2024, 11:18):

> Hi,
>
> Apache Iceberg is adding variant type support [1][2] by adopting the
> variant spec [3] from Apache Spark. As the proposal is getting mature,
> both Iceberg [4] and Spark [5] communities are discussing moving the
> variant type to Parquet repo to avoid divergence. Moving it into Parquet
> makes the variant spec engine and table format agnostic, which may
> encourage wider adoption.
>
> What do people from Parquet community think?
>
> [1] https://lists.apache.org/thread/xnyo1k66dxh0ffpg7j9f04xgos0kwc34
> [2] https://lists.apache.org/thread/xcyytoypgplfr74klg1z2rgjo6k5b0sq
> [3] https://github.com/apache/spark/blob/d84f1a3575c4125009374521d2f179089ebd71ad/common/variant/README.md
> [4] https://lists.apache.org/thread/hopkr2f0ftoywwt9zo3jxb7n0ob5s5bw
> [5] https://lists.apache.org/thread/0k5oj3mn0049fcxoxm3gx3d7r28gw4rj
>
> Best,
> Gang
>
