Re: Custom Rust Parquet thrift parser (was [DISCUSS] flatbuf footer)

2025-10-24 Thread Andrew Bell
On Thu, Oct 23, 2025 at 1:54 PM Andrew Lamb wrote:

> Breaking this off into its own thread.
>
> In case anyone is interested, I just published a blog[1] post about the new
> metadata decoder we released for the Rust implementation of Parquet that
> explains background, results we achieved, and how it works.
>

I might be interested in doing something similar for C++ if there's
interest (the parser, not the blog ;)

-- 
Andrew Bell
[email protected]


Re: Custom Rust Parquet thrift parser (was [DISCUSS] flatbuf footer)

2025-10-24 Thread Andrew Bell
On Fri, Oct 24, 2025 at 5:54 AM Andrew Lamb wrote:

> Hi Andrew,
>
> Thrift messages are defined in the Thrift Interface Definition Language[1]
> and the binary encoding used in Parquet is the "Thrift Compact Protocol"
> [2].
>

Thanks. I looked through the docs on the Thrift website, but didn't know
about the other docs on GitHub.

-- 
Andrew Bell
[email protected]


Re: Custom Rust Parquet thrift parser (was [DISCUSS] flatbuf footer)

2025-10-24 Thread Andrew Lamb
Hi Andrew,

Thrift messages are defined in the Thrift Interface Definition Language[1]
and the binary encoding used in Parquet is the "Thrift Compact Protocol"
[2].

There is some additional detail in the "Background: Apache Thrift" section
of the blog[3] for anyone curious.

Andrew

[1]: https://thrift.apache.org/docs/idl
[2]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
[3]: https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/
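
For a feel of what the Compact Protocol looks like on the wire: per the spec
in [2], integers are zigzag-encoded and then written as variable-length
(ULEB128) varints. Below is a minimal Rust sketch of decoding one such
integer. It is an illustration only, not code from the blog or from the Rust
Parquet implementation, and the function names (read_uvarint, zigzag_decode,
read_i64) are hypothetical.

```rust
/// Decode an unsigned LEB128 varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends before the varint does or the varint exceeds 10 bytes.
fn read_uvarint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        // The low 7 bits carry data, least-significant group first.
        value |= u64::from(byte & 0x7f) << (7 * i);
        // A clear high bit marks the final byte of the varint.
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Undo zigzag encoding: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
fn zigzag_decode(n: u64) -> i64 {
    ((n >> 1) as i64) ^ -((n & 1) as i64)
}

/// Read one signed integer: a varint whose payload is zigzag-encoded.
fn read_i64(buf: &[u8]) -> Option<(i64, usize)> {
    let (raw, used) = read_uvarint(buf)?;
    Some((zigzag_decode(raw), used))
}

fn main() {
    // 300 zigzag-encodes to 600, which is the varint bytes [0xd8, 0x04].
    println!("{:?}", read_i64(&[0xd8, 0x04]));
    // A dangling continuation bit means the input is truncated.
    println!("{:?}", read_i64(&[0x80]));
}
```

A real decoder also has to handle field headers, strings, lists, and nested
structs; the spec in [2] covers those.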




On Thu, Oct 23, 2025 at 6:25 PM Andrew Bell wrote:

> Hi Andrew,
>
> I'm having trouble locating documentation of the binary encoding of Thrift.
> Can you point me to it?
>
> Thanks,
>
> On Thu, Oct 23, 2025 at 1:54 PM Andrew Lamb wrote:
>
> > Breaking this off into its own thread.
> >
> > In case anyone is interested, I just published a blog[1] post about the new
> > metadata decoder we released for the Rust implementation of Parquet that
> > explains background, results we achieved, and how it works.
> >
> > Andrew
> >
> > [1]: https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/
>
>
>
> --
> Andrew Bell
> [email protected]
>


Re: Custom Rust Parquet thrift parser (was [DISCUSS] flatbuf footer)

2025-10-23 Thread Andrew Bell
Hi Andrew,

I'm having trouble locating documentation of the binary encoding of Thrift.
Can you point me to it?

Thanks,

On Thu, Oct 23, 2025 at 1:54 PM Andrew Lamb wrote:

> Breaking this off into its own thread.
>
> In case anyone is interested, I just published a blog[1] post about the new
> metadata decoder we released for the Rust implementation of Parquet that
> explains background, results we achieved, and how it works.
>
> Andrew
>
> [1]: https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/



-- 
Andrew Bell
[email protected]