Attendees:

   -

   Julien Le Dem <[email protected]> Datadog: follow up on encodings
   and flatbuff footer, meeting timing
   -

   Rok Mihevc - Arctos Alliance, listening in, curious about flat buffer
   metadata progress


   -

   Andrew Lamb (InfluxData) - Variant and Geospatial Blogs (would like to
   hear about ALP progress if any)
   -

   Kenny Daniel - Hyperparam
   -

   Ben Owad - Snowflake - listening
   -

   Jiayi Wang - Databricks - listening
   -

   Aihua Xu - Snowflake - listening, and Variant Blog: The Evolution of
   Semi-Structured Data: Introducing Variant in Apache Parquet
   
<https://docs.google.com/document/d/1ABr3p-xj_8rHQ2kdzzDSceejGkU0nWnriZGPjhDoDBc/edit?tab=t.0>
   -

   Anurag Mantripragada - Apple - Wanted to share Efficient Column Updates
   in Iceberg
   
<https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0>
   and listening in (will have to drop in 30 minutes)
   -

   Arnav Balyan - Uber - FSST Spec: Parquet FSST Support: Specification
   
<https://docs.google.com/document/d/1Xg2b8HR19QnI3nhtQUDWZJhCLwJzW6y9tU1ziiLFZrM/edit?tab=t.0#heading=h.a9r0tnd6fhtq>
   -

   Jiaying Li - CMU - listening
   -

   Russell Spitzer - Snowflake - Column Updates - listening

Notes:

   -

   Announcements:
   -

      Welcome Andrew to the PMC!
      -

   Sharing Efficient Column Updates in Iceberg
   
<https://docs.google.com/document/d/1Bd7JVzgajA8-DozzeEE24mID_GLuz6iwj0g4TlcVJcs/edit?tab=t.0>

   -

      Column update is a single parquet file with a subset of columns
      -

      Stitching columns on read.
      -

      Should this be in Iceberg or Parquet? Document describes both options
      with pros and cons.
      -

         Russell: Iceberg
         -

      Sync: Tuesday 9am PT; see the Iceberg community calendar:
      -


         https://iceberg.apache.org/community/#apache-iceberg-community-calendar
         -

      Parquet:
      -

   Blogs!
   -

      Variant: The Evolution of Semi-Structured Data: Introducing Variant
      in Apache Parquet
      
<https://docs.google.com/document/d/1ABr3p-xj_8rHQ2kdzzDSceejGkU0nWnriZGPjhDoDBc/edit?tab=t.0>
      -

      Geospatial types: Parquet Geo data type blog post
      
<https://docs.google.com/document/d/1JPK0F6Vn4sjXGO4AzrkywOjlj_ybDV_6v0zuaFIIjlk/edit?tab=t.0#heading=h.f5ymbunigmpp>
      -

      Ask: please read and provide feedback if you are interested
      -

      Once docs have settled down we will turn them into markdown and post
      to https://parquet.apache.org/
      -

   Meeting timing
   -

      The meeting will shift by a week. The next sync is in 3 weeks.
      -

   Updates:
   -

      Encodings
      -

         FSST (Arnav): Parquet FSST Support: Specification
         
<https://docs.google.com/document/d/1Xg2b8HR19QnI3nhtQUDWZJhCLwJzW6y9tU1ziiLFZrM/edit?tab=t.0#heading=h.a9r0tnd6fhtq>
         -

            Great comments on the proposal, spec released.
            -

            Questions:
            -

               Is the symbol table in each page or in the dictionary page?
               -

               Preferred => dictionary page to start with
               -

            Spec will need review:
            -

               Everyone please review!!
               -

               In particular: Julien Le Dem <[email protected]>
               [email protected], [email protected]
               -

         ALP: [Parquet] ALP Spec.docx
         
<https://docs.google.com/document/d/1xz2cudDpN2Y1ImFcTXh15s-3fPtD_aWt/edit>
         -

            All feedback has been addressed
            -

            Except:
            -

               Exceptions encoding is still being discussed.
               -

                  TODO: need help finalizing that decision
                  -

                  Andrew: I agree that optimizing for a large number of
                  exceptions is necessary -- as ALP is not going to be
                  a good choice in the case where there are a large
                  number of exceptions
                  -

               Version field in the ALP header or not?
               -

                  Goal to
                  -

                     customize integer encoding
                     -

                     ALP-RD
                     -

                  Question of using new enum instead?
                  -

                  TODO: help finalizing that decision point in the doc
                  -

            Kenny: hyparquet has a branch with experimental ALP support,
            but would really benefit from some example parquet-testing
            files.
            -

            Need example files for other implementations.
            -

               Is it easy to generate files with the C++ implementation?
               -

               TODO: utility to generate files in C++.
               -

      flatbuff footer:
      -

         Jiayi: comments on the spec have been addressed and tested
         internally
         -

         TODO:
         -

            sync in the OSS PR.
            -

            Encryption is added to the spec but no implementation so far
            -

               Need review: Rok volunteering.
               -

            Send final reminder to mailing list


On Tue, Feb 3, 2026 at 4:49 PM Julien Le Dem <[email protected]> wrote:

> The next Parquet sync is tomorrow Wednesday Feb 4th at 10am PT - 1pm ET -
> 7pm CET
>
> To join the invite, join the group:
> https://groups.google.com/g/apache-parquet-community-sync
>
> Everybody is welcome, bring your topic or just listen in.
>
> (Some more details on how the meeting is run:
> https://lists.apache.org/thread/bjdkscmx7zvgfbw0wlfttxy8h6v3f71t )
>
