Hey Viquar,

There shouldn't be a read regression here, since the data files would have columnar stats that cover the ability to prune based on partitions (essentially all the partition transforms are derivations on a source data column). There have been discussions in the sync on whether we should keep the partition tuple for manifests, and there are nuances around writer requirements if we were to rely completely on column stats. Regardless of whether the partition tuple is kept, from a pruning perspective we certainly want to keep the same level of pruning as we had before; that's a critical property to preserve.
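To make the pruning argument concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not spec text or Iceberg code: `ColumnStats`, `days`, and `may_contain_day` are hypothetical names, and it assumes an order-preserving transform such as days(ts). A hash-based transform like bucket would not be recoverable from min/max bounds on the source column in this way.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def days(ts: datetime) -> int:
    """Whole days since 1970-01-01 UTC (stand-in for a days(ts) transform)."""
    return (ts - EPOCH).days


@dataclass
class ColumnStats:
    """Hypothetical per-file lower/upper bounds on the source column ts."""
    lower: datetime
    upper: datetime


def may_contain_day(stats: ColumnStats, day: int) -> bool:
    """Return False only if the file cannot hold any row with days(ts) == day.

    days() is non-decreasing in ts, so every row in the file satisfies
    days(stats.lower) <= days(ts) <= days(stats.upper).
    """
    return days(stats.lower) <= day <= days(stats.upper)


# A file whose rows all fall on 2025-12-30 (UTC).
file_stats = ColumnStats(
    lower=datetime(2025, 12, 30, 1, 0, tzinfo=timezone.utc),
    upper=datetime(2025, 12, 30, 23, 0, tzinfo=timezone.utc),
)
target = days(datetime(2025, 12, 30, tzinfo=timezone.utc))
keep = may_contain_day(file_stats, target)       # file must be read
skip = not may_contain_day(file_stats, target + 1)  # file can be pruned
```

For a file written into a single day partition, the derived lower and upper day values come out equal, so the check behaves like an equality test on the partition value.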
If we model the partition transform as an expression with its own ID, we could then have stats on that expression. For example, if you have a column ts and partition by days(ts), there would be an expression in metadata representing days(ts), and the stats for each data file would contain an entry with lower(days(ts)) and upper(days(ts)). For a partitioned file, the lower and upper bounds would have to be equal. For a leaf manifest in the root, we'd have the aggregated lower/upper stats, which are effectively the same as the partition field summary that exists today. In short, a reader could just run data filters and get the same level of pruning as before. Notice that with this modeling we avoid having to tie a manifest to a given partition spec, as happens today.

I do think the open question we need to reach a conclusion on is whether to keep the partition tuple or rely completely on stats on expressions.

For reference, here is a past v4 sync discussion on this topic (linked to the time the discussion starts):
<https://drive.google.com/file/d/1gv8TrR5xzqqNxek7_sTZkpbwQx1M3dhK/view?usp=sharing&t=2327>

Let me know if that makes sense!

On Tue, Dec 30, 2025 at 10:47 AM vaquar khan <[email protected]> wrote:

> Hi everyone,
>
> I’ve been following the recent discussions and design documents regarding
> the Adaptive Metadata Tree and Single-File Commits for the V4 Spec.
>
> While moving to a Root Manifest structure solves the write amplification
> issue on S3/GCS, I am concerned about a potential regression in Partition
> Pruning efficiency for readers. Specifically, when Data Files are inlined
> into the Root Manifest, we lose the explicit partition summary bounds that
> existed in the V3 Manifest List.
>
> Without a standardized way to store lightweight partition stats for these
> inlined entries, query planners may be forced to scan significantly more
> metadata bytes to perform the same pruning we get for free today.
>
> *Proposal*: I propose we explicitly standardize a "Compact Partition
> Summary" (possibly using Bloom Filters or compressed min/max tuples) within
> the Root Manifest entry schema. This would ensure that V4 maintains the
> "File Skipping" performance of V3 while gaining the write throughput of the
> new tree structure.
>
> I am drafting a short design doc outlining the schema changes and backward
> compatibility implications for this.
>
> Before I circulate the doc, has there been any consensus on how to handle
> partition stats for inlined files in the combined Spitzer/Jahagirdar
> proposal?
>
> Regards,
> Viquar Khan
> Sr. Data Architect
> https://www.linkedin.com/in/vaquar-khan-b695577/
>
