We were also discussing implementing Variant in Rust [1], but it was
hard due to a lack of other implementations or example data to test against.

Maybe once there is a draft PoC for Java, we could whip up something for
Rust that does the same.

[1]: https://github.com/apache/arrow-rs/issues/6736

On Wed, Dec 4, 2024 at 4:57 AM Gang Wu <ust...@gmail.com> wrote:

> > With regards to Variant implementations, for Java, don't we need the
> > format released before the implementation can be provided (I thought
> > parquet-java consumed a released parquet-format jar in its build)?
>
> For parquet-java, the PoC PR is usually based on a locally built,
> unreleased version of parquet-format while the spec change is under
> review. Once the vote has passed and a new parquet-format is released,
> the PoC PR gets rebased on the released format for a final review.
> Below are some examples:
>
> float16: https://github.com/apache/parquet-java/pull/1142
> size stats: https://github.com/apache/parquet-java/pull/1177
> geometry: https://github.com/apache/parquet-java/pull/2971
>
> Best,
> Gang
>
> On Wed, Dec 4, 2024 at 2:57 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > Hi Gene,
> >
> > Before the release, I added a proposal to add a shredding version to
> > the annotation (https://github.com/apache/parquet-format/pull/474).
> > It would be good to discuss whether people think there is value in
> > this.
> >
> >
> >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change.
> >
> >
> > With regards to Variant implementations, for Java, don't we need the
> > format released before the implementation can be provided (I thought
> > parquet-java consumed a released parquet-format jar in its build)?
> >
> >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change. There
> > > are also concerns from the variant logical type PR [3] against
> > > parquet-java. This is something to discuss in the community if we
> > > want to make the variant type an exception.
> >
> >
> > I thought the compromise we came to is that the documentation for
> > Variant states that it is still experimental (maybe we should add this
> > as a comment to parquet.thrift as well to make this very clear). I was
> > under the impression that Variant would stay experimental until the 2
> > implementations were complete. I think we should clarify the scope of
> > what we think is acceptable for the implementations (but that should
> > probably be a separate thread). I also have some concerns about the
> > current variant spec after reviewing the initial spec and the proposed
> > shredding simplification [1], which I'll raise on a separate thread.
> >
> > Thanks,
> > Micah
> >
> > [1] https://github.com/apache/parquet-format/pull/461
> >
> >
> >
> > On Tue, Dec 3, 2024 at 10:28 PM Gang Wu <ust...@gmail.com> wrote:
> >
> > > Hi Gene,
> > >
> > > Thanks for your effort on adding variant type to the parquet-format!
> > > For the next release, I'd like to include the geometry type [1] as
> > > well which is also targeted for the Iceberg V3 spec. I can volunteer
> > > to be the release manager.
> > >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change. There
> > > are also concerns from the variant logical type PR [3] against
> > > parquet-java. This is something to discuss in the community if we
> > > want to make the variant type an exception.
> > >
> > > [1] https://github.com/apache/parquet-format/pull/240
> > > [2] https://lists.apache.org/thread/f9379yx0lf5gtpkgyv922pvowtzy4kmm
> > > [3] https://github.com/apache/parquet-java/pull/3072
> > >
> > > Best,
> > > Gang
> > >
> > > On Wed, Dec 4, 2024 at 2:08 PM Gene Pang <gene.p...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We updated parquet-format <https://github.com/apache/parquet-format>
> > > > to include the Variant logical type annotation. Would someone be
> > > > able to release parquet-format (and create the necessary artifacts)
> > > > so that parquet-java can be updated to depend on the new release?
> > > > This would enable adding implementation in parquet-java.
> > > >
> > > > Thanks!
> > > > Gene
> > > >
> > >
> >
>
