One thing to mention here is that wide tables are not the only use case for better footers. Thrift is slow. In our (Salesforce) lake, we suffer quite a latency hit for cold tables just because thrift deserialization takes quite some time. Even though we also have a custom thrift parser that doesn't do allocations. But still, parsing footers takes time. If you just want to do a point access, parsing the footer can take 100x longer than the actual access. Icebergs can often have small files if they weren't compacted yet. For these, the footer is often the largest part of the file. So having something that allows you to skip some parts of the footer would be nice, but just having the footer be faster would be way better than that.
Cheers, Jan Am Do., 26. März 2026 um 14:57 Uhr schrieb Ed Seidl <[email protected]>: > Yes, thank you. Perhaps something like > https://github.com/apache/arrow-rs/pull/8714 would work ;-) > > I've been looking for an excuse to revisit this. > > Ed > > On 2026/03/26 09:38:21 Andrew Lamb wrote: > > Thank you Will, > > > > I think this idea is very interesting. > > > > > it’s entirely possible this index-only approach was evaluated and > > discarded early on. > > > > I don't know of any such discussion or evaluation. One challenge of this > > approach might be getting thrift-compiler generated parsers to cooperate. > > > > > would love to hear your thoughts on whether a lightweight index could > > give us the O(1) read performance we want without the file bloat we are > > trying to avoid. > > > > I personally think it sounds like a great idea -- and I suggest as a next > > step a prototype in one of the open source parquet readers so we can > > measure its performance. The arrow-rs implementation (now) has its own > > custom thrift decoder, so it might be relatively easy to test out such a > > scheme. > > > > Andrew > > > > > > > > > > On Thu, Mar 26, 2026 at 5:19 AM Will Edwards via dev < > [email protected]> > > wrote: > > > > > Hi everyone, > > > > > > I have only recently joined the online community, so please accept my > > > apologies for jumping in with this input so late in the process. I've > been > > > catching up on the discussions around the parquet3.fbs FlatBuffer > footer > > > proposal, and first, I want to say I completely agree with the core > > > motivation: Parquet desperately needs O(1) random access to column > metadata > > > to avoid the linear scanning and heap pressure that currently penalize > > > wide-table workloads. > > > > > > I write Parquet parsers, and my profiling aligns with what many here > have > > > likely seen. As demonstrated in the *arrow-rs* blog post about custom > > > Thrift parsing, simply decoding but discarding (skipping heap > allocation > > > for) unneeded columns yields a 3x-9x speedup. This highlights that the > > > historic cost of Thrift parsing has been as much a code problem as a > data > > > problem. If we can introduce a way to completely *skip* rather than > > > *decode* > > > the Thrift bytes of unwanted columns, we can make things faster still. > > > > > > Given the ongoing challenges with file bloat and metadata duplication > in > > > the current FlatBuffer proposal, I wanted to float an alternative > > > architectural approach: *What if we keep the Thrift footer intact as > the > > > single source of truth, but append a lightweight, O(1) "Footer Index"?* > > > The Core Proposal: A Lightweight Offset Index > > > > > > Instead of duplicating ColumnMetaData (statistics, encodings, etc.) > into a > > > massive new FlatBuffer structure, we could append a highly compact jump > > > table. For any given row group and column ordinal, this index would > simply > > > tell the engine exactly where that column's metadata begins inside the > > > legacy Thrift footer. > > > > > > A query engine would simply: > > > > > > 1. > > > > > > 2. Consult the new Footer Index to get the exact byte offsets of the > > > target column's metadata across the required row groups. > > > 2. > > > > > > Because this parsing is highly targeted, the parser can employ a > low- or > > > zero-allocation pattern to save those very last CPU cycles. > > > 3. > > > > > > Jump the instruction pointer directly to those bytes in the legacy > > > Thrift footer and decode only the requested ColumnMetaData structs. > > > Because this parsing is highly targeted, the parser can employ a > low- or > > > zero-allocation pattern to save those very last CPU cycles. > > > > > > Optional Add-On: O(1) Access by Name > > > > > > For engines or use cases that don't rely on external catalogs or ID > > > mappings, we could easily extend this concept to natively support O(1) > > > column resolution by name. > > > > > > We could bake a lightweight hash table into the index that simply maps > node > > > names to their ordinals, and struct names to their corresponding > groups of > > > ordinals. This provides immediate, zero-scan access to any field by > name > > > while keeping the footprint microscopically small compared to > duplicating > > > the entire metadata payload. > > > Potential Benefits > > > > > > - > > > > > > *Solves the Allocation Bottleneck:* It provides the exact O(1) > random > > > access needed to skip unwanted columns, entirely eliminating the > linear > > > scanning and garbage collection overhead. > > > - > > > > > > *Drastically Smaller File Sizes:* We avoid the 3x-4x file bloat > > > currently seen when translating Thrift metadata directly to > > > FlatBuffers. We > > > are only storing an index of integers (and optionally a small hash > > > table). > > > - > > > > > > *Single Source of Truth:* We avoid "dual parser" drift. Writers > don't > > > have to serialize two complete metadata payloads, and the format > doesn't > > > have to define complex rules about which footer takes precedence if > they > > > disagree. > > > > > > I realize there is already significant momentum and code written for > > > parquet3.fbs, and it’s entirely possible this index-only approach was > > > evaluated and discarded early on. If so, I’d love to understand the > > > technical hurdles it faced (e.g., were there issues safely > instantiating a > > > Thrift decoder mid-stream?). > > > > > > I appreciate all the hard work going into modernizing the format and > would > > > love to hear your thoughts on whether a lightweight index could give > us the > > > O(1) read performance we want without the file bloat we are trying to > > > avoid. > > > > > > Thanks, > > > > > > Will > > > > > >
