> Do we have volunteers to implement it in Parquet-java + another OSS implementation?
I don't think that should be a blocker for incorporating. I'd be inclined to
do something like mark it as experimental or similar in the spec until the
reference impls are done.

On Fri, Aug 23, 2024 at 10:32 AM Micah Kornfield <emkornfi...@gmail.com> wrote:

> I'm in favor of this, but wondering on the logistics. Do we have
> volunteers to implement it in Parquet-java + another OSS implementation or
> are we going to bypass this requirement for now?
>
> Thanks,
> Micah
>
> On Friday, August 23, 2024, Ryan Blue <b...@databricks.com.invalid> wrote:
>
> > +1
> >
> > On Fri, Aug 23, 2024 at 12:30 PM Jacques Nadeau <jacq...@apache.org> wrote:
> >
> > > +1
> > >
> > > On Fri, Aug 23, 2024 at 8:51 AM Nong Li <non...@gmail.com> wrote:
> > >
> > > > +1.
> > > >
> > > > On Fri, Aug 23, 2024 at 12:57 PM Jan Finis <jpfi...@gmail.com> wrote:
> > > >
> > > > > I would also appreciate having native Variant support in Parquet.
> > > > >
> > > > > On Fri, Aug 23, 2024 at 12:10 PM Fokko Driesprong <fo...@apache.org> wrote:
> > > > >
> > > > > > Hey Gang,
> > > > > >
> > > > > > Thanks for raising this. +1 from my end.
> > > > > >
> > > > > > For context, as Gang mentioned, when proposing to add a Variant Type to
> > > > > > Iceberg <https://github.com/apache/iceberg/issues/10392>, one of the future
> > > > > > goals was to integrate more closely with Parquet, and having the spec at
> > > > > > Parquet will help to speed this up.
> > > > > >
> > > > > > Kind regards,
> > > > > > Fokko
> > > > > >
> > > > > > On Fri, Aug 23, 2024 at 11:37 AM Gábor Szádovszky <ga...@apache.org> wrote:
> > > > > >
> > > > > > > Hi Gang,
> > > > > > >
> > > > > > > Thanks for bringing this up.
> > > > > > >
> > > > > > > I think that if Variant type would have come up earlier (before
> > > > > > > iceberg/arrow), its natural place would have been at the file format level
> > > > > > > as any other types. The communities started discussing where it should be
> > > > > > > placed because now we have different type systems at different places.
> > > > > > > Also, the current spec of Variant makes it more or less independent from
> > > > > > > the Parquet file format.
> > > > > > > However, even at Parquet level, we would need at least an additional
> > > > > > > Logical type to help handle Variant type by the systems reading/writing
> > > > > > > Parquet.
> > > > > > >
> > > > > > > To summarize my opinion, +1 for having the whole Variant spec in Parquet
> > > > > > > format.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Gabor
> > > > > > >
> > > > > > > On Fri, Aug 23, 2024 at 11:18 AM Gang Wu <ust...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Apache Iceberg is adding variant type support [1][2] by adopting the variant
> > > > > > > > spec [3] from Apache Spark. As the proposal is getting mature, both Iceberg [4]
> > > > > > > > and Spark [5] communities are discussing moving the variant type to Parquet
> > > > > > > > repo to avoid divergence. Moving it into Parquet makes the variant spec engine
> > > > > > > > and table format agnostic, which may encourage wider adoption.
> > > > > > > >
> > > > > > > > What do people from Parquet community think?
> > > > > > > >
> > > > > > > > [1] https://lists.apache.org/thread/xnyo1k66dxh0ffpg7j9f04xgos0kwc34
> > > > > > > > [2] https://lists.apache.org/thread/xcyytoypgplfr74klg1z2rgjo6k5b0sq
> > > > > > > > [3] https://github.com/apache/spark/blob/d84f1a3575c4125009374521d2f179089ebd71ad/common/variant/README.md
> > > > > > > > [4] https://lists.apache.org/thread/hopkr2f0ftoywwt9zo3jxb7n0ob5s5bw
> > > > > > > > [5] https://lists.apache.org/thread/0k5oj3mn0049fcxoxm3gx3d7r28gw4rj
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Gang
> >
> > --
> > Ryan Blue
> > Databricks