Attendees:

al...@influxdata.com: Andrew, InfluxData, saying Hi! Not causing trouble (he says)

Micah: Google, saying Hi! Causing trouble (he says)

Raul: Arrow C++/Py release manager

Julien: Datadog

Notes:

   - Updates
      - Variant: iterating in parquet-format repo
      - Footer metadata rewrite
         - Need to review individual optimizations
            - Relative indices:
               - Trade offs:
                  - Pro: Smaller metadata
                  - Con: More complex computation
            - Do we need 2 layers of metadata?
               - Modular vs one footer?
            - How much does the footer size matter?


   - Encodings
      - How to add future-proof encodings
         - Andrew suggested Wasm-based plugins:
            - Expressivity?
            - Security?
         - Could be great for experimenting with new encodings.
         - Not great for having a standard, fully defined format.
            - Query engines do integrate decoding with evaluation.
            - Would need a clear contract:
               - skip(n)
               - decode_n_values_to_arrow
               - …
            - Storing plugin in file? => security issues
      - Independent discussion of adding a few encodings
         - Good integer encodings for timestamps.
         - Good encodings for floats
         - …
      - Compelling new encodings papers:
         - FastLanes paper: 10x improvement?
         - BtrBlocks: a way to cascade encodings in a better way.
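
To make the "clear contract" point above concrete, here is a minimal sketch of what a decoder plugin interface might look like. Everything in it is hypothetical: the class and method names are illustrative, nothing here is an existing Parquet or Arrow API, and a plain Python list stands in for an Arrow buffer. The toy delta decoder is only there to exercise the interface.

```python
from abc import ABC, abstractmethod
from typing import List


class EncodingPlugin(ABC):
    """Hypothetical decoder contract, loosely following the notes above."""

    @abstractmethod
    def skip(self, n: int) -> int:
        """Advance past up to n values; return how many were skipped."""

    @abstractmethod
    def decode_n_values(self, n: int) -> List[int]:
        """Decode up to n values (a real plugin would fill an Arrow buffer)."""


class DeltaDecoder(EncodingPlugin):
    """Toy plugin: the first value is stored as-is, the rest as deltas."""

    def __init__(self, first: int, deltas: List[int]):
        # Treat the first value as a delta from zero.
        self._encoded = [first] + list(deltas)
        self._pos = 0
        self._prev = 0

    def _next(self) -> int:
        self._prev += self._encoded[self._pos]
        self._pos += 1
        return self._prev

    def skip(self, n: int) -> int:
        count = min(n, len(self._encoded) - self._pos)
        for _ in range(count):
            self._next()  # deltas still have to be summed while skipping
        return count

    def decode_n_values(self, n: int) -> List[int]:
        count = min(n, len(self._encoded) - self._pos)
        return [self._next() for _ in range(count)]


decoder = DeltaDecoder(1000, [5, 5, 7])   # encodes 1000, 1005, 1010, 1017
skipped = decoder.skip(2)                 # 2
values = decoder.decode_n_values(10)      # [1010, 1017]
```

Even in this toy case, skip(n) is not free: a delta encoding still has to sum the skipped deltas, so the cost of each operation depends on the encoding and would probably need to be part of any such contract.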


On Wed, Oct 23, 2024 at 8:33 AM Julien Le Dem <jul...@apache.org> wrote:

> The next Parquet sync is today Oct 23rd at 9:30am PT - 12:30pm ET - 6:30pm
> CET
> (in ~ 1h)
> To join the invite:
> https://calendar.app.google/GjNGkjfMYyoBUpaGA
> Please contact me to be added to the recurring invite.
> Everybody is welcome, bring your topic or just listen in.
> Best
> Julien
>
