>Does anyone have any questions or concerns about getting this out? Gang, you mentioned that you would like to volunteer as release manager, are you still available? :)
I think we should close on whether we want versioning (and what sort) for
Variant [1] in the Thrift header. I'd also prefer to wait on releasing Variant
until there is an official ratification in Parquet (it seems like it should be
close). It seems like people might get confused about its status if they aren't
reading the docs carefully for current support levels.

Thanks,
Micah

[1] https://github.com/apache/parquet-format/pull/474

On Wed, Feb 19, 2025 at 10:51 PM Gang Wu <ust...@gmail.com> wrote:

> If there is no objection, I will prepare the release candidate of
> parquet-format 2.11.0 and send out the vote early next week.
>
> On Mon, Feb 17, 2025 at 8:47 PM Gang Wu <ust...@gmail.com> wrote:
>
> > Thanks Fokko for bringing this up! Yes, I can be the release manager
> > if the community reaches a consensus.
> >
> > Best,
> > Gang
> >
> > On Mon, Feb 17, 2025 at 6:58 PM Fokko Driesprong <fo...@apache.org> wrote:
> >
> >> Hey everyone,
> >>
> >> I would love to bubble this back up to the top of our mailboxes.
> >>
> >> - For Variant, various implementations are in flight: Java in
> >> Parquet-Java <https://github.com/apache/parquet-java/pull/3117> and
> >> Iceberg-Java <https://github.com/apache/iceberg/pull/12139>, C++
> >> <https://github.com/apache/arrow/pull/45375> in Arrow, Python
> >> <https://github.com/apache/spark/blob/master/python/pyspark/sql/variant_utils.py>
> >> in Spark, and the Arrow Rust community also expressed interest
> >> <https://github.com/apache/arrow-rs/issues/6736>.
> >> - For Geometry/Geography, we see a C++ PR
> >> <https://github.com/apache/arrow/pull/45459> in Arrow, Java in Parquet
> >> <https://github.com/apache/parquet-java/pull/2971>, but the vote has
> >> just passed last week. We also see that geo support has been added to
> >> Iceberg <https://github.com/apache/iceberg/pull/10981>.
> >>
> >> Both Variant and Geo have been voted for and merged in the format spec.
> >> To maintain momentum I think it would be good to get the thrift
> >> definitions and the Java convenience JAR out.
> >>
> >> Does anyone have any questions or concerns about getting this out? Gang,
> >> you mentioned that you would like to volunteer as release manager, are
> >> you still available? :)
> >>
> >> Kind regards,
> >> Fokko
> >>
> >> On Thu, Dec 5, 2024 at 05:33, Gene Pang <gene.p...@gmail.com> wrote:
> >>
> >> > I see, thanks for the clarifications!
> >> >
> >> > I will work on porting the Spark Java implementation to parquet-java.
> >> >
> >> > Spark also has a (partial) python implementation for the Variant
> >> > binary format, but it needs a bit more work to complete.
> >> >
> >> > Thanks,
> >> > Gene
> >> >
> >> > On Wed, Dec 4, 2024 at 6:11 AM Andrew Lamb <andrewlam...@gmail.com> wrote:
> >> >
> >> > > We also were discussing trying to implement variant in Rust[1], but
> >> > > it was hard due to a lack of other implementations or example data
> >> > > to test against
> >> > >
> >> > > Maybe once there is a draft POC for Java, we could whip up something
> >> > > for Rust that did the same
> >> > >
> >> > > [1]: https://github.com/apache/arrow-rs/issues/6736
> >> > >
> >> > > On Wed, Dec 4, 2024 at 4:57 AM Gang Wu <ust...@gmail.com> wrote:
> >> > >
> >> > > > > With regards to Variant implementations, for Java, don't we need
> >> > > > > the format released before the implementation can be provided (I
> >> > > > > thought parquet-java consumed a released parquet-format jar in
> >> > > > > its build)?
> >> > > >
> >> > > > For parquet-java, usually the PoC PR is based on a locally built
> >> > > > parquet-format with an unreleased version when the spec change is
> >> > > > under review. Once the vote has been passed and a new
> >> > > > parquet-format is released, the PoC PR gets rebased on the
> >> > > > released format for a final review. Below are some examples:
> >> > > >
> >> > > > float16: https://github.com/apache/parquet-java/pull/1142
> >> > > > size stats: https://github.com/apache/parquet-java/pull/1177
> >> > > > geometry: https://github.com/apache/parquet-java/pull/2971
> >> > > >
> >> > > > Best,
> >> > > > Gang
> >> > > >
> >> > > > On Wed, Dec 4, 2024 at 2:57 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >> > > >
> >> > > > > Hi Gene,
> >> > > > >
> >> > > > > Before release, I added a proposal to have a shredding version
> >> > > > > added to the annotation
> >> > > > > (https://github.com/apache/parquet-format/pull/474), it would be
> >> > > > > good to discuss if people think there is value in this.
> >> > > > >
> >> > > > > > However, there was a discussion [2] on the requirement of two
> >> > > > > > PoC reference implementations when promoting a new format
> >> > > > > > change.
> >> > > > >
> >> > > > > With regards to Variant implementations, for Java, don't we need
> >> > > > > the format released before the implementation can be provided (I
> >> > > > > thought parquet-java consumed a released parquet-format jar in
> >> > > > > its build)?
> >> > > > >
> >> > > > > > However, there was a discussion [2] on the requirement of two
> >> > > > > > PoC reference implementations when promoting a new format
> >> > > > > > change. There are also concerns from the variant logical type
> >> > > > > > PR [3] against parquet-java. This is something to discuss in
> >> > > > > > the community if we want to make the variant type an
> >> > > > > > exception.
> >> > > > >
> >> > > > > I thought the compromise we came to is that the documentation
> >> > > > > for Variant states that it is still experimental (maybe we
> >> > > > > should add this as a comment to parquet.thrift as well to make
> >> > > > > this very clear). I was under the impression that Variant would
> >> > > > > stay experimental until the 2 implementations were complete. I
> >> > > > > think we should clarify the scope of what we think is acceptable
> >> > > > > for the implementations (but that should probably be a separate
> >> > > > > thread). I also have some concerns about the current variant
> >> > > > > spec after reviewing the initial spec and the proposed shredding
> >> > > > > simplification [1], which I'll raise on a separate thread.
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Micah
> >> > > > >
> >> > > > > [1] https://github.com/apache/parquet-format/pull/461
> >> > > > >
> >> > > > > On Tue, Dec 3, 2024 at 10:28 PM Gang Wu <ust...@gmail.com> wrote:
> >> > > > >
> >> > > > > > Hi Gene,
> >> > > > > >
> >> > > > > > Thanks for your effort on adding variant type to the
> >> > > > > > parquet-format! For the next release, I'd like to include the
> >> > > > > > geometry type [1] as well which is also targeted for the
> >> > > > > > Iceberg V3 spec. I can volunteer to be the release manager.
> >> > > > > >
> >> > > > > > However, there was a discussion [2] on the requirement of two
> >> > > > > > PoC reference implementations when promoting a new format
> >> > > > > > change. There are also concerns from the variant logical type
> >> > > > > > PR [3] against parquet-java. This is something to discuss in
> >> > > > > > the community if we want to make the variant type an
> >> > > > > > exception.
> >> > > > > >
> >> > > > > > [1] https://github.com/apache/parquet-format/pull/240
> >> > > > > > [2] https://lists.apache.org/thread/f9379yx0lf5gtpkgyv922pvowtzy4kmm
> >> > > > > > [3] https://github.com/apache/parquet-java/pull/3072
> >> > > > > >
> >> > > > > > Best,
> >> > > > > > Gang
> >> > > > > >
> >> > > > > > On Wed, Dec 4, 2024 at 2:08 PM Gene Pang <gene.p...@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > Hi,
> >> > > > > > >
> >> > > > > > > We updated parquet-format
> >> > > > > > > <https://github.com/apache/parquet-format> to include the
> >> > > > > > > Variant logical type annotation. Would someone be able to
> >> > > > > > > release parquet-format (and create the necessary artifacts)
> >> > > > > > > so that parquet-java can be updated to depend on the new
> >> > > > > > > release? This would enable adding implementation in
> >> > > > > > > parquet-java.
> >> > > > > > >
> >> > > > > > > Thanks!
> >> > > > > > > Gene