Attendees:

Micah

Fokko: Variant support. Release to fix an issue with the latest Spark.
Remove old Hadoop versions. Defining semantic versioning.

Gene: Variant support PRs.

Julien: progress on Variant.

Alkis: No updates from me; the code freeze is slowing down progress on
testing the new footer in the fleet.

Notes:

   - Parquet 1.14.4 release?
     https://lists.apache.org/thread/zdxcn9o70mfoo6zrvp7nbz9m8lym6x2c
      - Dictionaries over 8 kB don't work properly:
        https://github.com/apache/parquet-java/pull/3041
         - The dictionary is truncated.
      - Introduced in 1.14.1: https://github.com/apache/parquet-java/pull/1278
      - Is it a minor or a patch-level release?
      - TODO:
         - Chime in on the mailing list thread.
   - Hadoop support
     https://lists.apache.org/thread/bvyg48ysp3rkh4ls0lnn597ygpyf6goj
      - Dropping old versions
      - Decoupling more from Hadoop
         - Configuration
         - File abstraction
            - https://github.com/apache/parquet-java/pull/3030
      - TODO:
         - Chime in on the mailing list thread.
   - Semantic versioning: https://github.com/apache/parquet-site/pull/86
      - TODO:
         - Review PR 86.
   - Variant support progress
      - 2 PRs
         - Small updates to the spec.
            - https://github.com/apache/parquet-format/pull/457
         - Adding a logical type for Variant
            - https://github.com/apache/parquet-format/pull/460
         - TODO:
            - These PRs have been approved by committers, and enough
              time has been given for feedback.
            - We can merge them and start an implementation.
            - Fokko to merge the PRs.
      - Next steps:
         - Release parquet-format.
         - Update the dependency in parquet-java and start the
           implementation.


On Wed, Nov 6, 2024 at 7:55 AM Julien Le Dem <jul...@apache.org> wrote:

> The next Parquet sync is today Nov 6th at 9:30am PT - 12:30pm ET - 6:30pm
> CET
> (in ~ 90min)
> To join the invite:
> https://calendar.app.google/rB5qogkDQ58p4wiJ9
> Please contact me to be added to the recurring invite.
> Everybody is welcome, bring your topic or just listen in.
> Best
> Julien
>
