We were also discussing trying to implement variant in Rust [1], but it was hard due to a lack of other implementations or example data to test against.
Maybe once there is a draft POC for Java, we could whip up something for Rust that did the same.

[1]: https://github.com/apache/arrow-rs/issues/6736

On Wed, Dec 4, 2024 at 4:57 AM Gang Wu <ust...@gmail.com> wrote:

> > With regards to Variant implementations, for Java, don't we need the
> > format released before the implementation can be provided (I thought
> > parquet-java consumed a released parquet-format jar in its build)?
>
> For parquet-java, usually the PoC PR is based on a locally built
> parquet-format with an unreleased version when the spec change is under
> review. Once the vote has been passed and a new parquet-format is
> released, the PoC PR gets rebased on the released format for a final
> review. Below are some examples:
>
> float16: https://github.com/apache/parquet-java/pull/1142
> size stats: https://github.com/apache/parquet-java/pull/1177
> geometry: https://github.com/apache/parquet-java/pull/2971
>
> Best,
> Gang
>
> On Wed, Dec 4, 2024 at 2:57 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
> > Hi Gene,
> >
> > Before release, I added a proposal to have a shredding version added
> > to the annotation (https://github.com/apache/parquet-format/pull/474),
> > it would be good to discuss if people think there is value in this.
> >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change.
> >
> > With regards to Variant implementations, for Java, don't we need the
> > format released before the implementation can be provided (I thought
> > parquet-java consumed a released parquet-format jar in its build)?
> >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change. There
> > > are also concerns from the variant logical type PR [3] against
> > > parquet-java. This is something to discuss in the community if we
> > > want to make the variant type an exception.
> >
> > I thought the compromise we came to is that the documentation for
> > Variant states that it is still experimental (maybe we should add this
> > as a comment to parquet.thrift as well to make this very clear). I was
> > under the impression that Variant would stay experimental until the 2
> > implementations were complete. I think we should clarify the scope of
> > what we think is acceptable for the implementations but that should
> > probably be a separate thread). I also have some concerns about some
> > current variant spec after reviewing initial spec and the proposed
> > shredding simplification [1], which I'll raise on a separate thread.
> >
> > Thanks,
> > Micah
> >
> > [1] https://github.com/apache/parquet-format/pull/461
> >
> > On Tue, Dec 3, 2024 at 10:28 PM Gang Wu <ust...@gmail.com> wrote:
> >
> > > Hi Gene,
> > >
> > > Thanks for your effort on adding variant type to the parquet-format!
> > > For the next release, I'd like to include the geometry type [1] as
> > > well which is also targeted for the Iceberg V3 spec. I can volunteer
> > > to be the release manager.
> > >
> > > However, there was a discussion [2] on the requirement of two PoC
> > > reference implementations when promoting a new format change. There
> > > are also concerns from the variant logical type PR [3] against
> > > parquet-java. This is something to discuss in the community if we
> > > want to make the variant type an exception.
> > >
> > > [1] https://github.com/apache/parquet-format/pull/240
> > > [2] https://lists.apache.org/thread/f9379yx0lf5gtpkgyv922pvowtzy4kmm
> > > [3] https://github.com/apache/parquet-java/pull/3072
> > >
> > > Best,
> > > Gang
> > >
> > > On Wed, Dec 4, 2024 at 2:08 PM Gene Pang <gene.p...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We updated parquet-format <https://github.com/apache/parquet-format>
> > > > to include the Variant logical type annotation. Would someone be
> > > > able to release parquet-format (and create the necessary artifacts)
> > > > so that parquet-java can be updated to depend on the new release?
> > > > This would enable adding implementation in parquet-java.
> > > >
> > > > Thanks!
> > > > Gene