Hi Julien.

Thank you for reconnecting the threads.

I have written up my experiments as a commit-by-commit narrative showing how
we can go from flatbuffers being ~2x larger than thrift to being smaller than
thrift (at times even half the size). This is still on an internal
branch. I will resume work towards the end of this month to port it to
arrow so that folks can look at the progress and share ideas.

On the benchmarking front, I need to build and share a binary so that third
parties can donate their footers for analysis.

The PR for parquet extensions has gotten a few rounds of reviews:
https://github.com/apache/parquet-format/pull/254. I hope it will be merged
soon.

I missed the sync yesterday - for some reason I didn't receive an
invitation. Julien, could you add me to the invite list again?

On Thu, Aug 15, 2024 at 1:32 AM Julien Le Dem <jul...@apache.org> wrote:

> This came up in the sync today.
>
> There are a few concurrent experiments with flatbuffers for a future
> Parquet footer replacement. In itself it is fine and just wanted to
> reconnect the threads here so that folks are aware of each other and can
> share findings.
>
> - Neelaksh benchmarking and experiments:
>
> https://medium.com/@neelaksh-singh/benchmarking-apache-parquet-my-mid-program-journey-as-an-mlh-fellow-bc0b8332c3b1
> https://github.com/Neelaksh-Singh/gresearch_parquet_benchmarking
>
> - Alkis has also been experimenting and led the proposal for enabling
> extending the existing footer.
>
> https://docs.google.com/document/d/1KkoR0DjzYnLQXO-d0oRBv2k157IZU0_injqd4eV4WiI/edit#heading=h.15ohoov5qqm6
>
> - Xuwei also shared that he is looking into this.
>
> I would suggest that you all reply to this thread sharing your current
> progress or ideas and a link to your respective repos for experimenting.
>
> Best
> Julien
>
