This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 279e98c Commit build products
279e98c is described below
commit 279e98c1cab32f2459c5baa96b1c9f4adc017905
Author: Build Pelican (action) <[email protected]>
AuthorDate: Tue Mar 25 15:47:23 2025 +0000
Commit build products
---
output/2025/03/21/parquet-pushdown/index.html | 311 +++++++++++++++++++++
output/author/xiangpeng-hao.html | 57 ++++
output/category/blog.html | 57 ++++
output/feed.xml | 40 ++-
output/feeds/all-en.atom.xml | 290 ++++++++++++++++++-
output/feeds/blog.atom.xml | 290 ++++++++++++++++++-
output/feeds/xiangpeng-hao.atom.xml | 290 ++++++++++++++++++-
output/feeds/xiangpeng-hao.rss.xml | 40 ++-
output/images/parquet-pushdown/baseline-impl.jpg | Bin 0 -> 166560 bytes
output/images/parquet-pushdown/cached-pages.jpg | Bin 0 -> 37963 bytes
output/images/parquet-pushdown/new-pipeline.jpg | Bin 0 -> 106800 bytes
output/images/parquet-pushdown/parquet-viewer.jpg | Bin 0 -> 344238 bytes
.../parquet-pushdown/pushdown-vs-no-pushdown.jpg | Bin 0 -> 194575 bytes
output/index.html | 57 ++++
14 files changed, 1427 insertions(+), 5 deletions(-)
diff --git a/output/2025/03/21/parquet-pushdown/index.html
b/output/2025/03/21/parquet-pushdown/index.html
new file mode 100644
index 0000000..49c5d9b
--- /dev/null
+++ b/output/2025/03/21/parquet-pushdown/index.html
@@ -0,0 +1,311 @@
+<!doctype html>
+<html class="no-js" lang="en" dir="ltr">
+ <head>
+ <meta charset="utf-8">
+ <meta http-equiv="x-ua-compatible" content="ie=edge">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>Efficient Filter Pushdown in Parquet - Apache DataFusion
Blog</title>
+<link href="/blog/css/bootstrap.min.css" rel="stylesheet">
+<link href="/blog/css/fontawesome.all.min.css" rel="stylesheet">
+<link href="/blog/css/headerlink.css" rel="stylesheet">
+<link href="/blog/highlight/default.min.css" rel="stylesheet">
+<script src="/blog/highlight/highlight.js"></script>
+<script>hljs.highlightAll();</script> </head>
+ <body class="d-flex flex-column h-100">
+ <main class="flex-shrink-0">
+<!-- nav bar -->
+<nav class="navbar navbar-expand-lg navbar-dark bg-dark" aria-label="Fifth
navbar example">
+ <div class="container-fluid">
+ <a class="navbar-brand" href="/blog"><img
src="/blog/images/logo_original4x.png" style="height: 32px;"/> Apache
DataFusion Blog</a>
+ <button class="navbar-toggler" type="button" data-bs-toggle="collapse"
data-bs-target="#navbarADP" aria-controls="navbarADP" aria-expanded="false"
aria-label="Toggle navigation">
+ <span class="navbar-toggler-icon"></span>
+ </button>
+
+ <div class="collapse navbar-collapse" id="navbarADP">
+ <ul class="navbar-nav me-auto mb-2 mb-lg-0">
+ <li class="nav-item">
+ <a class="nav-link" href="/blog/about.html">About</a>
+ </li>
+ <li class="nav-item">
+ <a class="nav-link" href="/blog/feed.xml">RSS</a>
+ </li>
+ </ul>
+ </div>
+ </div>
+</nav>
+
+
+<!-- page contents -->
+<div id="contents">
+ <div class="bg-white p-5 rounded">
+ <div class="col-sm-8 mx-auto">
+ <h1>
+ Efficient Filter Pushdown in Parquet
+ </h1>
+ <p>Posted on: Fri 21 March 2025 by Xiangpeng Hao</p>
+ <style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng Hao's
blog</a>. Thanks to <a href="https://www.influxdata.com/">InfluxData</a> for
sponsoring this work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post</a>, we discussed how <a href="https://datafusion.apache.org/">Apache
DataFusion</a> prunes <a href="https://parquet.apache.org/">Apache Parquet</a>
files to skip irrelevant <strong>files/row_groups</strong> (sometimes also <a
href="https://parquet.apache.org/docs/file-format/pageindex/">pages</a>).</p>
+<p>This post discusses how Parquet readers skip irrelevant
<strong>rows</strong> while scanning data,
+leveraging Parquet's columnar layout by first reading only filter columns,
+and then selectively reading other columns only for matching rows.</p>
+<h2>Why filter pushdown in Parquet?</h2>
+<p>Below is an example query that reads sensor data with filters on
<code>date_time</code> and <code>location</code>.
+Without filter pushdown, all rows from <code>location</code>,
<code>val</code>, and <code>date_time</code> columns are decoded before
<code>location='office'</code> is evaluated. Filter pushdown is especially
useful when the filter is selective, i.e., removes many rows.</p>
+<div class="codehilite"><pre><span></span><code><span class="k">SELECT</span>
<span class="n">val</span><span class="p">,</span> <span
class="k">location</span>
+<span class="k">FROM</span> <span class="n">sensor_data</span>
+<span class="k">WHERE</span> <span class="n">date_time</span> <span
class="o">></span> <span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span> <span class="o">=</span>
<span class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<figure>
+<img alt="Parquet pruning skips irrelevant files/row_groups, while filter
pushdown skips irrelevant rows. Without filter pushdown, all rows from
location, val, and date_time columns are decoded before `location='office'` is
evaluated. Filter pushdown is especially useful when the filter is selective,
i.e., removes many rows." class="img-responsive"
src="/blog/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg" width="80%"/>
+<figcaption>
+ Parquet pruning skips irrelevant files/row_groups, while filter pushdown
skips irrelevant rows. Without filter pushdown, all rows from location, val,
and date_time columns are decoded before `location='office'` is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.
+ </figcaption>
+</figure>
+<p>In our setup, sensor data is aggregated by date — each day has its
own Parquet file.
+At planning time, DataFusion prunes the unneeded Parquet files, i.e.,
<code>2025-03-10.parquet</code> and <code>2025-03-11.parquet</code>.</p>
+<p>Once the files to read are located, <a
href="https://github.com/apache/datafusion/issues/3463"><em>DataFusion's
current default implementation</em></a> reads all the projected columns
(<code>sensor_id</code>, <code>val</code>, and <code>location</code>) into
Arrow RecordBatches, then applies the filters over <code>location</code> to get
the final set of rows.</p>
+<p>A better approach is called <strong>filter pushdown</strong> with
<strong>late materialization</strong>, which evaluates filter conditions first
and only decodes data that passes these conditions.
+In practice, this works by first processing only the filter columns
(<code>date_time</code> and <code>location</code>), building a boolean mask of
rows that satisfy our conditions, then using this mask to selectively decode
only the relevant rows from other columns (<code>sensor_id</code>,
<code>val</code>).
+This eliminates the waste of decoding rows that will be immediately filtered
out.</p>
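<p>As a sketch of this idea (using plain Rust slices and hypothetical helper names, not the actual Arrow/Parquet APIs), the reader first builds a boolean mask from the filter columns, then decodes only the selected rows of the remaining columns:</p>

```rust
// Hypothetical sketch of late materialization; plain slices stand in
// for decoded Parquet columns.

// Build a boolean mask from the filter columns only.
// ISO-8601 date strings compare correctly as plain strings.
fn build_mask(date_time: &[&str], location: &[&str]) -> Vec<bool> {
    date_time
        .iter()
        .zip(location)
        .map(|(dt, loc)| *dt > "2025-03-11" && *loc == "office")
        .collect()
}

// "Decode" only the rows of `column` selected by `mask`.
fn decode_selected<'a>(column: &[&'a str], mask: &[bool]) -> Vec<&'a str> {
    column
        .iter()
        .zip(mask)
        .filter_map(|(v, keep)| keep.then_some(*v))
        .collect()
}

fn main() {
    let date_time = ["2025-03-10", "2025-03-12", "2025-03-13"];
    let location = ["office", "office", "home"];
    let val = ["1.0", "2.0", "3.0"];

    let mask = build_mask(&date_time, &location);
    // Only rows passing the filter are decoded for `val`.
    let vals = decode_selected(&val, &mask);
    assert_eq!(vals, ["2.0"]);
}
```

<p>In the real reader the "decode" step operates on compressed Parquet pages rather than in-memory slices, but the control flow is the same.</p>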
+<p>While simple in theory, practical implementations often make performance
worse.</p>
+<h2>How can filter pushdown be slower?</h2>
+<p>At a high level, the Parquet reader first builds a filter mask --
essentially a boolean array indicating which rows meet the filter criteria --
and then uses this mask to selectively decode only the needed rows from the
remaining columns in the projection.</p>
+<p>Let's dig into details of <a
href="https://github.com/apache/arrow-rs/blob/d5339f31a60a4bd8a4256e7120fe32603249d88e/parquet/src/arrow/async_reader/mod.rs#L618-L712">how
filter pushdown is implemented</a> in the current Rust Parquet reader
implementation, illustrated in the following figure.</p>
+<figure>
+<img alt="Implementation of filter pushdown in Rust Parquet readers"
class="img-responsive" src="/blog/images/parquet-pushdown/baseline-impl.jpg"
width="70%"/>
+<figcaption>
+ Implementation of filter pushdown in Rust Parquet readers -- the first
phase builds the filter mask, the second phase applies the filter mask to the
other columns
+ </figcaption>
+</figure>
+<p>The filter pushdown has two phases:</p>
+<ol>
+<li>
+<p>Build the filter mask (steps 1-3)</p>
+</li>
+<li>
+<p>Use the filter mask to selectively decode other columns (steps 4-7); e.g.,
the output of step 3 is used as input for steps 5 and 7.</p>
+</li>
+</ol>
+<p>Within each phase, the reader takes three steps to go from Parquet to
Arrow:</p>
+<ol>
+<li>
+<p>Decompress the Parquet pages using generic decompression algorithms like
LZ4, Zstd, etc. (steps 1, 4, 6)</p>
+</li>
+<li>
+<p>Decode the page content into Arrow format (steps 2, 5, 7)</p>
+</li>
+<li>
+<p>Evaluate the filter over Arrow data (step 3)</p>
+</li>
+</ol>
+<p>In the figure above, we can see that <code>location</code> is
<strong>decompressed and decoded twice</strong>, first when building the filter
mask (steps 1, 2), and second when building the output (steps 4, 5).
+This happens for all columns that appear both in the filter and output.</p>
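<p>A toy model of the two-phase baseline (the column names follow the example query; the counting logic is purely illustrative) makes the duplicated work explicit:</p>

```rust
use std::collections::HashMap;

// Count how many times each column chunk is decompressed+decoded
// under the two-phase baseline.
fn baseline_decode_counts<'a>(
    filter_cols: &[&'a str],
    output_cols: &[&'a str],
) -> HashMap<&'a str, u32> {
    let mut counts = HashMap::new();
    // Phase 1: decode every filter column to build the mask.
    for c in filter_cols {
        *counts.entry(*c).or_insert(0) += 1;
    }
    // Phase 2: decode every output column to build the result.
    for c in output_cols {
        *counts.entry(*c).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = baseline_decode_counts(
        &["date_time", "location"], // WHERE date_time > .. AND location = ..
        &["val", "location"],       // SELECT val, location
    );
    // `location` appears in both the filter and the output,
    // so it is decompressed/decoded twice.
    assert_eq!(counts["location"], 2);
    assert_eq!(counts["val"], 1);
}
```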
+<p>The table below shows the corresponding CPU time on the <a
href="https://github.com/apache/datafusion/blob/main/benchmarks/queries/clickbench/queries.sql#L23">ClickBench
query 22</a>:</p>
+<div class="codehilite"><pre><span></span><code><span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| Decompress | Decode | Apply filter| Others |</span>
+<span class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| 206 ms | 117 ms | 22 ms | 48 ms |</span>
+<span class="err">+------------+--------+-------------+--------+</span>
+</code></pre></div>
+<p>Clearly, decompress/decode operations dominate the time spent. With filter
pushdown, the reader decompresses/decodes the shared columns twice; without
filter pushdown, it does so only once.
+This explains why filter pushdown is slower in some cases.</p>
+<blockquote>
+<p><strong>Note:</strong> Highly selective filters may skip entire pages; but
as long as the reader needs even one row from a page, it must decompress (and
often decode) the entire page.</p>
+</blockquote>
+<h2>Attempt: cache filter columns</h2>
+<p>Intuitively, caching the filter columns and reusing them later could
help.</p>
+<p>But naively caching decoded pages consumes prohibitively high memory:</p>
+<ol>
+<li>
+<p>It needs to cache Arrow arrays, which are on average <a
href="https://github.com/XiangpengHao/liquid-cache/blob/main/dev/doc/liquid-cache-vldb.pdf">4x
larger than Parquet data</a>.</p>
+</li>
+<li>
+<p>It needs to cache the <strong>entire column chunk in memory</strong>,
because Phase 1 builds filters over the whole column chunk, and the cached data
is only consumed in Phase 2.</p>
+</li>
+<li>
+<p>The memory usage is proportional to the number of filter columns, which is
unbounded.</p>
+</li>
+</ol>
+<p>Worse, caching filter columns means the reader must read partly from
Parquet and partly from the cache, which is complex to implement and would
likely require a substantial change to the current implementation.</p>
+<blockquote>
+<p><strong>Feel the complexity:</strong> consider building a cache that
properly handles nested columns, multiple filters, and filters with multiple
columns.</p>
+</blockquote>
+<h2>Real solution</h2>
+<p>We need a solution that:</p>
+<ol>
+<li>
+<p>Is simple to implement, i.e., doesn't require thousands of lines of
code.</p>
+</li>
+<li>
+<p>Incurs minimal memory overhead.</p>
+</li>
+</ol>
+<p>This section describes my <a
href="https://github.com/apache/arrow-rs/pull/6921#issuecomment-2718792433">&lt;700
LOC PR (with lots of comments and tests)</a> that <strong>reduces total
ClickBench time by 15%, with up to 2x lower latency for some queries, no
obvious regression on other queries, and caches at most 2 pages (~2MB) per
column in memory</strong>.</p>
+<figure>
+<img alt="New decoding pipeline: building the filter mask and the output
columns is interleaved in a single pass, allowing the reader to cache a minimal
number of pages for a minimal amount of time" class="img-responsive"
src="/blog/images/parquet-pushdown/new-pipeline.jpg" width="80%"/>
+<figcaption>
+    New decoding pipeline: building the filter mask and the output columns is
interleaved in a single pass, allowing the reader to cache a minimal number of
pages for a minimal amount of time
+ </figcaption>
+</figure>
+<p>The new pipeline interleaves the previous two phases into a single pass, so
that:</p>
+<ol>
+<li>
+<p>The page being decompressed is immediately used to build filter masks and
output columns.</p>
+</li>
+<li>
+<p>Decompressed pages are cached for minimal time; after one pass (steps 1-6),
the cache memory is released for the next pass. </p>
+</li>
+</ol>
+<p>This allows the cache to only hold 1 page at a time, and to immediately
discard the previous page after it's used, significantly reducing the memory
requirement for caching.</p>
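<p>The interleaving can be sketched as a single loop over pages (a simplified model with made-up types, not the arrow-rs implementation):</p>

```rust
// One decoded page of a column; the filter `> 10` is arbitrary.
struct Page {
    rows: Vec<i64>,
}

// Single-pass pipeline: each page is decoded once, used to build both
// the filter mask and the output, then dropped before the next page.
fn single_pass(pages: Vec<Page>) -> Vec<i64> {
    let mut out = Vec::new();
    for page in pages {
        // Build the filter mask from the freshly decoded page.
        let mask: Vec<bool> = page.rows.iter().map(|v| *v > 10).collect();
        // Immediately reuse the same decoded page for the output.
        out.extend(
            page.rows
                .iter()
                .zip(&mask)
                .filter(|(_, m)| **m)
                .map(|(v, _)| *v),
        );
        // `page` is dropped here: the cache never holds more than one
        // decoded data page at a time.
    }
    out
}

fn main() {
    let pages = vec![Page { rows: vec![5, 20] }, Page { rows: vec![30, 1] }];
    assert_eq!(single_pass(pages), vec![20, 30]);
}
```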
+<h3>What pages are cached?</h3>
+<p>You may have noticed that only <code>location</code> is cached, not
<code>val</code>, because <code>val</code> is only used for output.
+More generally, only columns that appear both in the filter and output are
cached, and at most 1 page is cached for each such column.</p>
+<p>More examples:</p>
+<div class="codehilite"><pre><span></span><code><span class="k">SELECT</span>
<span class="n">val</span>
+<span class="k">FROM</span> <span class="n">sensor_data</span>
+<span class="k">WHERE</span> <span class="n">date_time</span> <span
class="o">></span> <span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span> <span class="o">=</span>
<span class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, no columns are cached, because <code>val</code> is not used
for filtering.</p>
+<div class="codehilite"><pre><span></span><code><span class="k">SELECT</span>
<span class="k">COUNT</span><span class="p">(</span><span
class="o">*</span><span class="p">)</span>
+<span class="k">FROM</span> <span class="n">sensor_data</span>
+<span class="k">WHERE</span> <span class="n">date_time</span> <span
class="o">></span> <span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span> <span class="o">=</span>
<span class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, again, no columns are cached, because the output projection
is empty after query plan optimization.</p>
+<h3>Then why cache 2 pages per column instead of 1?</h3>
+<p>This is another real-world nuance of how Parquet lays out pages.</p>
+<p>Parquet by default encodes data using <a
href="https://parquet.apache.org/docs/file-format/data-pages/encodings/">dictionary
encoding</a>, which writes a dictionary page as the first page of a column
chunk, followed by the keys referencing the dictionary.</p>
+<p>You can see this in action using <a
href="https://parquet-viewer.xiangpeng.systems">parquet-viewer</a>:</p>
+<figure>
+<img alt="Parquet viewer shows the page layout of a column chunk"
class="img-responsive" src="/blog/images/parquet-pushdown/parquet-viewer.jpg"
width="80%"/>
+<figcaption>
+ Parquet viewer shows the page layout of a column chunk
+ </figcaption>
+</figure>
+<p>This means that decoding a data page actually references two pages:
the dictionary page and the data page.</p>
+<p>This is why the reader caches 2 pages per column: one dictionary page and
one data page.
+The data page slot moves forward as the reader progresses; the dictionary
page slot always references the first page.</p>
+<figure>
+<img alt="Cached two pages, one for dictionary (pinned), one for data (moves
as it reads the data)" class="img-responsive"
src="/blog/images/parquet-pushdown/cached-pages.jpg" width="80%"/>
+<figcaption>
+ Cached two pages, one for dictionary (pinned), one for data (moves as it
reads the data)
+ </figcaption>
+</figure>
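<p>Conceptually, the per-column cache is a two-slot structure (an illustrative model, not the arrow-rs internals): the dictionary slot is written once and pinned, while the data slot is overwritten as the reader advances:</p>

```rust
// Illustrative two-slot page cache; pages are modeled as raw bytes.
struct PageCache {
    dictionary: Option<Vec<u8>>, // pinned: always the first page
    data: Option<Vec<u8>>,       // moves forward as rows are read
}

impl PageCache {
    fn new() -> Self {
        PageCache { dictionary: None, data: None }
    }

    fn insert(&mut self, is_dictionary: bool, page: Vec<u8>) {
        if is_dictionary {
            // Only set once: the dictionary page stays pinned.
            self.dictionary.get_or_insert(page);
        } else {
            // Evict the previous data page; at most one is kept.
            self.data = Some(page);
        }
    }
}

fn main() {
    let mut cache = PageCache::new();
    cache.insert(true, vec![0]);  // dictionary page
    cache.insert(false, vec![1]); // data page 1
    cache.insert(false, vec![2]); // data page 2 evicts page 1
    // At most 2 pages live in the cache at any time.
    assert_eq!(cache.dictionary, Some(vec![0]));
    assert_eq!(cache.data, Some(vec![2]));
}
```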
+<h2>How does it perform?</h2>
+<p>Here are my results on <a
href="https://github.com/apache/datafusion/tree/main/benchmarks#clickbench">ClickBench</a>
on my AMD 9900X machine. The total time is reduced by 15%, with Q23 being
2.24x faster;
+the handful of queries that get slower are likely noise.</p>
+<div class="codehilite"><pre><span></span><code><span
class="err">┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓</span>
+<span class="err">┃ Query ┃ no-pushdown ┃ new-pushdown ┃ Change
┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩</span>
+<span class="err">│ QQuery 0 │ 0.47ms │ 0.43ms
│ +1.10x faster │</span>
+<span class="err">│ QQuery 1 │ 51.10ms │ 50.10ms
│ no change │</span>
+<span class="err">│ QQuery 2 │ 68.23ms │ 64.49ms
│ +1.06x faster │</span>
+<span class="err">│ QQuery 3 │ 90.68ms │ 86.73ms
│ no change │</span>
+<span class="err">│ QQuery 4 │ 458.93ms │ 458.59ms
│ no change │</span>
+<span class="err">│ QQuery 5 │ 522.06ms │ 478.50ms
│ +1.09x faster │</span>
+<span class="err">│ QQuery 6 │ 49.84ms │ 49.94ms
│ no change │</span>
+<span class="err">│ QQuery 7 │ 55.09ms │ 55.77ms
│ no change │</span>
+<span class="err">│ QQuery 8 │ 565.26ms │ 556.95ms
│ no change │</span>
+<span class="err">│ QQuery 9 │ 575.83ms │ 575.05ms
│ no change │</span>
+<span class="err">│ QQuery 10 │ 164.56ms │ 178.23ms
│ 1.08x slower │</span>
+<span class="err">│ QQuery 11 │ 177.20ms │ 191.32ms
│ 1.08x slower │</span>
+<span class="err">│ QQuery 12 │ 591.05ms │ 569.92ms
│ no change │</span>
+<span class="err">│ QQuery 13 │ 861.06ms │ 848.59ms
│ no change │</span>
+<span class="err">│ QQuery 14 │ 596.20ms │ 580.73ms
│ no change │</span>
+<span class="err">│ QQuery 15 │ 554.96ms │ 548.77ms
│ no change │</span>
+<span class="err">│ QQuery 16 │ 1175.08ms │ 1146.07ms
│ no change │</span>
+<span class="err">│ QQuery 17 │ 1150.45ms │ 1121.49ms
│ no change │</span>
+<span class="err">│ QQuery 18 │ 2634.75ms │ 2494.07ms
│ +1.06x faster │</span>
+<span class="err">│ QQuery 19 │ 90.15ms │ 89.24ms
│ no change │</span>
+<span class="err">│ QQuery 20 │ 620.15ms │ 591.67ms
│ no change │</span>
+<span class="err">│ QQuery 21 │ 782.38ms │ 703.15ms
│ +1.11x faster │</span>
+<span class="err">│ QQuery 22 │ 1927.94ms │ 1404.35ms
│ +1.37x faster │</span>
+<span class="err">│ QQuery 23 │ 8104.11ms │ 3610.76ms
│ +2.24x faster │</span>
+<span class="err">│ QQuery 24 │ 360.79ms │ 330.55ms
│ +1.09x faster │</span>
+<span class="err">│ QQuery 25 │ 290.61ms │ 252.54ms
│ +1.15x faster │</span>
+<span class="err">│ QQuery 26 │ 395.18ms │ 362.72ms
│ +1.09x faster │</span>
+<span class="err">│ QQuery 27 │ 891.76ms │ 959.39ms
│ 1.08x slower │</span>
+<span class="err">│ QQuery 28 │ 4059.54ms │ 4137.37ms
│ no change │</span>
+<span class="err">│ QQuery 29 │ 235.88ms │ 228.99ms
│ no change │</span>
+<span class="err">│ QQuery 30 │ 564.22ms │ 584.65ms
│ no change │</span>
+<span class="err">│ QQuery 31 │ 741.20ms │ 757.87ms
│ no change │</span>
+<span class="err">│ QQuery 32 │ 2652.48ms │ 2574.19ms
│ no change │</span>
+<span class="err">│ QQuery 33 │ 2373.71ms │ 2327.10ms
│ no change │</span>
+<span class="err">│ QQuery 34 │ 2391.00ms │ 2342.15ms
│ no change │</span>
+<span class="err">│ QQuery 35 │ 700.79ms │ 694.51ms
│ no change │</span>
+<span class="err">│ QQuery 36 │ 151.51ms │ 152.93ms
│ no change │</span>
+<span class="err">│ QQuery 37 │ 108.18ms │ 86.03ms
│ +1.26x faster │</span>
+<span class="err">│ QQuery 38 │ 114.64ms │ 106.22ms
│ +1.08x faster │</span>
+<span class="err">│ QQuery 39 │ 260.80ms │ 239.13ms
│ +1.09x faster │</span>
+<span class="err">│ QQuery 40 │ 60.74ms │ 73.29ms
│ 1.21x slower │</span>
+<span class="err">│ QQuery 41 │ 58.75ms │ 67.85ms
│ 1.15x slower │</span>
+<span class="err">│ QQuery 42 │ 65.49ms │ 68.11ms
│ no change │</span>
+<span
class="err">└──────────────┴─────────────┴──────────────┴───────────────┘</span>
+<span class="err">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓</span>
+<span class="err">┃ Benchmark Summary ┃ ┃</span>
+<span class="err">┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩</span>
+<span class="err">│ Total Time (no-pushdown) │ 38344.79ms
│</span>
+<span class="err">│ Total Time (new-pushdown) │ 32800.50ms
│</span>
+<span class="err">│ Average Time (no-pushdown) │ 891.74ms
│</span>
+<span class="err">│ Average Time (new-pushdown) │ 762.80ms
│</span>
+<span class="err">│ Queries Faster │ 13
│</span>
+<span class="err">│ Queries Slower │ 5
│</span>
+<span class="err">│ Queries with No Change │ 25
│</span>
+<span
class="err">└─────────────────────────────┴────────────┘</span>
+</code></pre></div>
+<h2>Conclusion</h2>
+<p>Despite being simple in theory, filter pushdown in Parquet is non-trivial
to implement.
+It requires understanding both the Parquet format and reader implementation
details.
+The challenge lies in efficiently coordinating decoding, filter evaluation,
and memory management.</p>
+<p>If you are interested in this level of optimization and want to help test,
document and implement this type of optimization, come find us in the <a
href="https://datafusion.apache.org/contributor-guide/communication.html">DataFusion
Community</a>. We would love to have you. </p>
+ </div>
+ </div>
+ </div>
+ <!-- footer -->
+ <div class="row">
+ <div class="large-12 medium-12 columns">
+ <p style="font-style: italic; font-size: 0.8rem; text-align: center;">
+ Copyright 2025, <a href="https://www.apache.org/">The Apache
Software Foundation</a>, Licensed under the <a
href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version
2.0</a>.<br/>
+ Apache® and the Apache feather logo are trademarks of The Apache
Software Foundation.
+ </p>
+ </div>
+ </div>
+ <script src="/blog/js/bootstrap.bundle.min.js"></script> </main>
+ </body>
+</html>
diff --git a/output/author/xiangpeng-hao.html b/output/author/xiangpeng-hao.html
index f2303b8..20b471a 100644
--- a/output/author/xiangpeng-hao.html
+++ b/output/author/xiangpeng-hao.html
@@ -47,6 +47,63 @@
<p><i>Here you can find the latest updates from DataFusion and
related projects.</i></p>
+ <!-- Post -->
+ <div class="row">
+ <div class="callout">
+ <article class="post">
+ <header>
+ <div class="title">
+ <h1><a
href="/blog/2025/03/21/parquet-pushdown">Efficient Filter Pushdown in
Parquet</a></h1>
+ <p>Posted on: Fri 21 March 2025 by Xiangpeng Hao</p>
+ <p><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng Hao's
blog</a>. Thanks to <a href="https://www.influxdata.com/">InfluxData</a> for
sponsoring this work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></p>
+ <footer>
+ <ul class="actions">
+ <div style="text-align: right"><a
href="/blog/2025/03/21/parquet-pushdown" class="button medium">Continue
Reading</a></div>
+ </ul>
+ <ul class="stats">
+ </ul>
+ </footer>
+ </article>
+ </div>
+ </div>
<!-- Post -->
<div class="row">
<div class="callout">
diff --git a/output/category/blog.html b/output/category/blog.html
index 56f617b..33393cc 100644
--- a/output/category/blog.html
+++ b/output/category/blog.html
@@ -47,6 +47,63 @@
<p><i>Here you can find the latest updates from DataFusion and
related projects.</i></p>
+ <!-- Post -->
+ <div class="row">
+ <div class="callout">
+ <article class="post">
+ <header>
+ <div class="title">
+ <h1><a
href="/blog/2025/03/21/parquet-pushdown">Efficient Filter Pushdown in
Parquet</a></h1>
+ <p>Posted on: Fri 21 March 2025 by Xiangpeng Hao</p>
+ <p><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng Hao's
blog</a>. Thanks to <a href="https://www.influxdata.com/">InfluxData</a> for
sponsoring this work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></p>
+ <footer>
+ <ul class="actions">
+ <div style="text-align: right"><a
href="/blog/2025/03/21/parquet-pushdown" class="button medium">Continue
Reading</a></div>
+ </ul>
+ <ul class="stats">
+ </ul>
+ </footer>
+ </article>
+ </div>
+ </div>
<!-- Post -->
<div class="row">
<div class="callout">
diff --git a/output/feed.xml b/output/feed.xml
index fccf433..fcdbe66 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -1,5 +1,43 @@
<?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion
Blog</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
20 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Apache DataFusion Comet
0.7.0
Release</title><link>https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0</link><description><!--
+<rss version="2.0"><channel><title>Apache DataFusion
Blog</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Fri,
21 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Efficient Filter
Pushdown in
Parquet</title><link>https://datafusion.apache.org/blog/2025/03/21/parquet-pushdown</link><description><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Xiangpeng
Hao</dc:creator><pubDate>Fri, 21 Mar 2025 00:00:00 +0000</pubDate><guid
isPermaLink="false">tag:datafusion.apache.org,2025-03-21:/blog/2025/03/21/parquet-pushdown</guid><category>blog</category></item><item><title>Apache
DataFusion Comet 0.7.0 Release</title><link>https://datafusion [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/feeds/all-en.atom.xml b/output/feeds/all-en.atom.xml
index 2dc033d..450d535 100644
--- a/output/feeds/all-en.atom.xml
+++ b/output/feeds/all-en.atom.xml
@@ -1,5 +1,293 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion
Blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
DataFusion Comet 0.7.0 Release</title><link
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0" rel
[...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion
Blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-21T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Efficient
Filter Pushdown in Parquet</title><link
href="https://datafusion.apache.org/blog/2025/03/21/parquet-pushdown"
rel="alter [...]
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></summary><content type="html"><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post</a>, we discussed how <a
href="https://datafusion.apache.org/">Apache DataFusion</a> prunes
<a href="https://parquet.apache.org/">Apache Parquet</a> files to
skip irrelevant <strong>files/row_groups</strong> (sometimes also
<a
href="https://parquet.apache.org/docs/file-format/pageindex/">pages</a>).</p>
+<p>This post discusses how Parquet readers skip irrelevant
<strong>rows</strong> while scanning data,
+leveraging Parquet's columnar layout by first reading only filter columns,
+and then selectively reading other columns only for matching rows.</p>
+<h2>Why filter pushdown in Parquet?</h2>
+<p>Below is an example query that reads sensor data with filters on
<code>date_time</code> and <code>location</code>.
+Without filter pushdown, all rows from <code>location</code>,
<code>val</code>, and <code>date_time</code> columns
are decoded before <code>location='office'</code> is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span><span
class="p">,</span> <span class="k">location</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<figure>
+<img alt="Parquet pruning skips irrelevant files/row_groups, while filter
pushdown skips irrelevant rows. Without filter pushdown, all rows from
location, val, and date_time columns are decoded before `location='office'` is
evaluated. Filter pushdown is especially useful when the filter is selective,
i.e., removes many rows." class="img-responsive"
src="/blog/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg" width="80%"/>
+<figcaption>
+ Parquet pruning skips irrelevant files/row_groups, while filter pushdown
skips irrelevant rows. Without filter pushdown, all rows from location, val,
and date_time columns are decoded before `location='office'` is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.
+ </figcaption>
+</figure>
+<p>In our setup, sensor data is aggregated by date &mdash; each day
has its own Parquet file.
+At planning time, DataFusion prunes the unneeded Parquet files, i.e.,
<code>2025-03-10.parquet</code> and
<code>2025-03-11.parquet</code>.</p>
+<p>Once the files to read are located, <a
href="https://github.com/apache/datafusion/issues/3463"><em>DataFusion's
current default implementation</em></a> reads all the projected
columns (<code>sensor_id</code>, <code>val</code>, and
<code>location</code>) into Arrow RecordBatches, then applies the
filters over <code>location</code> to get the final set of
rows.</p>
+<p>A better approach is called <strong>filter
pushdown</strong> with <strong>late materialization</strong>,
which evaluates filter conditions first and only decodes data that passes these
conditions.
+In practice, this works by first processing only the filter columns
(<code>date_time</code> and <code>location</code>),
building a boolean mask of rows that satisfy our conditions, then using this
mask to selectively decode only the relevant rows from other columns
(<code>sensor_id</code>, <code>val</code>).
+This eliminates the waste of decoding rows that will be immediately filtered
out.</p>
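<p>The flow above can be sketched in a few lines of illustrative Python, with plain lists standing in for decoded Parquet columns (the function and data are made up for illustration, not the actual DataFusion API):</p>

```python
# Illustrative sketch of late materialization; columns are plain Python
# lists rather than encoded Parquet pages, and all names are made up.

def late_materialize(columns, filter_cols, output_cols, predicate):
    n_rows = len(columns[filter_cols[0]])
    # Step 1: read only the filter columns and build a boolean mask.
    mask = [
        predicate({c: columns[c][i] for c in filter_cols})
        for i in range(n_rows)
    ]
    # Step 2: decode the other columns only for rows where the mask is True.
    return {
        c: [v for v, keep in zip(columns[c], mask) if keep]
        for c in output_cols
    }

data = {
    "date_time": ["2025-03-10", "2025-03-12", "2025-03-12"],
    "location":  ["home", "office", "home"],
    "val":       [1.0, 2.0, 3.0],
}
result = late_materialize(
    data,
    filter_cols=["date_time", "location"],
    output_cols=["val", "location"],
    predicate=lambda row: row["date_time"] > "2025-03-11"
                          and row["location"] == "office",
)
# result == {"val": [2.0], "location": ["office"]}
```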
+<p>While simple in theory, practical implementations often make
performance worse.</p>
+<h2>How can filter pushdown be slower?</h2>
+<p>At a high level, the Parquet reader first builds a filter mask --
essentially a boolean array indicating which rows meet the filter criteria --
and then uses this mask to selectively decode only the needed rows from the
remaining columns in the projection.</p>
+<p>Let's dig into details of <a
href="https://github.com/apache/arrow-rs/blob/d5339f31a60a4bd8a4256e7120fe32603249d88e/parquet/src/arrow/async_reader/mod.rs#L618-L712">how
filter pushdown is implemented</a> in the current Rust Parquet reader
implementation, illustrated in the following figure.</p>
+<figure>
+<img alt="Implementation of filter pushdown in Rust Parquet readers"
class="img-responsive" src="/blog/images/parquet-pushdown/baseline-impl.jpg"
width="70%"/>
+<figcaption>
+ Implementation of filter pushdown in Rust Parquet readers -- the first
phase builds the filter mask, the second phase applies the filter mask to the
other columns
+ </figcaption>
+</figure>
+<p>The filter pushdown has two phases:</p>
+<ol>
+<li>
+<p>Build the filter mask (steps 1-3)</p>
+</li>
+<li>
+<p>Use the filter mask to selectively decode other columns (steps 4-7),
e.g., the output of step 3 is used as input for steps 5 and 7.</p>
+</li>
+</ol>
+<p>Within each phase, it takes three steps from Parquet to
Arrow:</p>
+<ol>
+<li>
+<p>Decompress the Parquet pages using generic decompression algorithms
like LZ4, Zstd, etc. (steps 1, 4, 6)</p>
+</li>
+<li>
+<p>Decode the page content into Arrow format (steps 2, 5, 7)</p>
+</li>
+<li>
+<p>Evaluate the filter over Arrow data (step 3)</p>
+</li>
+</ol>
+<p>In the figure above, we can see that
<code>location</code> is <strong>decompressed and decoded
twice</strong>, first when building the filter mask (steps 1, 2), and
second when building the output (steps 4, 5).
+This happens for all columns that appear both in the filter and
output.</p>
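<p>A toy model makes the duplicated work visible; the decode counter and the page representation are illustrative simplifications, not the arrow-rs code:</p>

```python
# Toy model of the two-phase pipeline. decode_column() stands in for
# decompress (LZ4/Zstd/...) + decode into Arrow; the counter shows that
# a column used in both the filter and the output is processed twice.

decode_count = {}

def decode_column(pages, name):
    decode_count[name] = decode_count.get(name, 0) + 1
    return [row for page in pages for row in page]

def two_phase_read(pages_by_col, filter_col, predicate, output_cols):
    # Phase 1 (steps 1-3): decode the filter column, evaluate the filter.
    mask = [predicate(v)
            for v in decode_column(pages_by_col[filter_col], filter_col)]
    # Phase 2 (steps 4-7): decode every output column -- including the
    # filter column again, if it is also projected -- and apply the mask.
    return {
        c: [v for v, keep in zip(decode_column(pages_by_col[c], c), mask) if keep]
        for c in output_cols
    }

pages = {"location": [["home", "office"]], "val": [[1.0, 2.0]]}
out = two_phase_read(pages, "location", lambda v: v == "office",
                     ["val", "location"])
# decode_count == {"location": 2, "val": 1}
```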
+<p>The table below shows the corresponding CPU time on the <a
href="https://github.com/apache/datafusion/blob/main/benchmarks/queries/clickbench/queries.sql#L23">ClickBench
query 22</a>:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| Decompress | Decode | Apply filter| Others
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| 206 ms | 117 ms | 22 ms | 48 ms
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+</code></pre></div>
+<p>Clearly, decompress/decode operations dominate the time spent. With
filter pushdown, columns in both the filter and the output are
decompressed/decoded twice; without filter pushdown, each column is handled
only once.
+This explains why filter pushdown is slower in some cases.</p>
+<blockquote>
+<p><strong>Note:</strong> Highly selective filters may skip
an entire page; but if even one row from a page is needed, the reader must
decompress and often decode the entire page.</p>
+</blockquote>
+<h2>Attempt: cache filter columns</h2>
+<p>Intuitively, caching the filter columns and reusing them later could
help.</p>
+<p>But naively caching decoded pages consumes prohibitive amounts of
memory:</p>
+<ol>
+<li>
+<p>It needs to cache Arrow arrays, which are on average <a
href="https://github.com/XiangpengHao/liquid-cache/blob/main/dev/doc/liquid-cache-vldb.pdf">4x
larger than Parquet data</a>.</p>
+</li>
+<li>
+<p>It needs to cache the <strong>entire column chunk in
memory</strong>, because in Phase 1 it builds filters over the column
chunk, and only uses it in Phase 2.</p>
+</li>
+<li>
+<p>The memory usage is proportional to the number of filter columns,
which can be arbitrarily large.</p>
+</li>
+</ol>
+<p>Worse, caching filter columns means it needs to read partially from
Parquet and partially from cache, which is complex to implement, likely
requiring a substantial change to the current implementation. </p>
+<blockquote>
+<p><strong>Feel the complexity:</strong> consider building a
cache that properly handles nested columns, multiple filters, and filters with
multiple columns.</p>
+</blockquote>
+<h2>Real solution</h2>
+<p>We need a solution that:</p>
+<ol>
+<li>
+<p>Is simple to implement, i.e., doesn't require thousands of lines of
code.</p>
+</li>
+<li>
+<p>Incurs minimal memory overhead.</p>
+</li>
+</ol>
+<p>This section describes my <a
href="https://github.com/apache/arrow-rs/pull/6921#issuecomment-2718792433">&lt;700
LOC PR (with lots of comments and tests)</a> that <strong>reduces
total ClickBench time by 15%, with up to 2x lower latency for some queries, no
obvious regression on other queries, and caches at most 2 pages (~2MB) per
column in memory</strong>.</p>
+<figure>
+<img alt="New decoding pipeline, building filter mask and output columns
are interleaved in a single pass, allowing us to cache minimal pages for
minimal amount of time" class="img-responsive"
src="/blog/images/parquet-pushdown/new-pipeline.jpg" width="80%"/>
+<figcaption>
+ New decoding pipeline, building filter mask and output columns are
interleaved in a single pass, allowing us to cache minimal pages for minimal
amount of time
+ </figcaption>
+</figure>
+<p>The new pipeline interleaves the previous two phases into a single
pass, so that:</p>
+<ol>
+<li>
+<p>The page being decompressed is immediately used to build filter masks
and output columns.</p>
+</li>
+<li>
+<p>Decompressed pages are cached for minimal time; after one pass (steps
1-6), the cache memory is released for the next pass. </p>
+</li>
+</ol>
+<p>This allows the cache to only hold 1 page at a time, and to
immediately discard the previous page after it's used, significantly reducing
the memory requirement for caching.</p>
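<p>The interleaved pipeline can be modeled the same way: each pass decodes one filter-column page, uses it for both the filter mask and the output, then drops it (again a simplified sketch with made-up names, not the arrow-rs implementation):</p>

```python
# Toy model of the interleaved pipeline: building the filter mask and
# the output columns happens in a single pass per page, so the cached
# page lives only until the pass finishes.

def interleaved_read(pages_by_col, filter_col, predicate, output_cols):
    result = {c: [] for c in output_cols}
    for i in range(len(pages_by_col[filter_col])):  # one pass per page
        cached = pages_by_col[filter_col][i]        # decoded once, cached
        mask = [predicate(v) for v in cached]       # mask for this pass
        for c in output_cols:
            # Reuse the cached page instead of decoding filter_col again.
            page = cached if c == filter_col else pages_by_col[c][i]
            result[c].extend(v for v, keep in zip(page, mask) if keep)
        cached = None                               # released before next pass
    return result

pages = {
    "location": [["home", "office"], ["office", "lab"]],
    "val":      [[1.0, 2.0], [3.0, 4.0]],
}
out = interleaved_read(pages, "location", lambda v: v == "office",
                       ["val", "location"])
# out == {"val": [2.0, 3.0], "location": ["office", "office"]}
```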
+<h3>What pages are cached?</h3>
+<p>You may have noticed that only <code>location</code> is
cached, not <code>val</code>, because <code>val</code>
is only used for output.
+More generally, only columns that appear both in the filter and output are
cached, and at most 1 page is cached for each such column.</p>
+<p>More examples:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, no columns are cached, because
<code>val</code> is not used for filtering.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span
class="k">COUNT</span><span class="p">(</span><span
class="o">*</span><span class="p">)</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, again, no columns are cached, because the output
projection is empty after query plan optimization.</p>
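<p>The caching rule in all three examples reduces to a set intersection between filter columns and output columns; a minimal sketch (the function name is hypothetical):</p>

```python
def columns_to_cache(filter_cols, output_cols):
    # Only columns used both for filtering and for output benefit from
    # caching the decoded page; order follows the filter columns.
    out = set(output_cols)
    return [c for c in filter_cols if c in out]

# SELECT val, location ... WHERE date_time > ... AND location = ...
shared = columns_to_cache(["date_time", "location"], ["val", "location"])
# shared == ["location"]: only location is in both the filter and the output
```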
+<h3>Then why cache 2 pages per column instead of 1?</h3>
+<p>This is another real-world nuance regarding how Parquet lays out the
pages.</p>
+<p>Parquet by default encodes data using <a
href="https://parquet.apache.org/docs/file-format/data-pages/encodings/">dictionary
encoding</a>, which writes a dictionary page as the first page of a
column chunk, followed by the keys referencing the dictionary.</p>
+<p>You can see this in action using <a
href="https://parquet-viewer.xiangpeng.systems">parquet-viewer</a>:</p>
+<figure>
+<img alt="Parquet viewer shows the page layout of a column chunk"
class="img-responsive" src="/blog/images/parquet-pushdown/parquet-viewer.jpg"
width="80%"/>
+<figcaption>
+ Parquet viewer shows the page layout of a column chunk
+ </figcaption>
+</figure>
+<p>This means that decoding a page of data actually references two
pages: the dictionary page and the data page.</p>
+<p>This is why the reader caches 2 pages per column: one dictionary page and
one data page.
+The data-page slot moves forward as the data is read, while the
dictionary-page slot always references the first page.</p>
+<figure>
+<img alt="Cached two pages, one for dictionary (pinned), one for data
(moves as it reads the data)" class="img-responsive"
src="/blog/images/parquet-pushdown/cached-pages.jpg" width="80%"/>
+<figcaption>
+ Cached two pages, one for dictionary (pinned), one for data (moves as it
reads the data)
+ </figcaption>
+</figure>
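<p>A toy model of the per-column cache shows why both slots are needed: the dictionary slot stays pinned while the data-page slot advances (class and method names are made up for illustration):</p>

```python
class TwoSlotPageCache:
    """Toy per-column page cache: one pinned dictionary page, one data page."""

    def __init__(self, dictionary_page):
        self.dictionary_page = dictionary_page  # pinned for the whole chunk
        self.data_page = None                   # advances as reading proceeds

    def load_data_page(self, page):
        self.data_page = page                   # evicts the previous data page

    def decode(self, offset):
        # Data pages hold dictionary keys, so resolving one value
        # touches both cached pages.
        return self.dictionary_page[self.data_page[offset]]

cache = TwoSlotPageCache(dictionary_page=["home", "office"])
cache.load_data_page([0, 1, 1])  # dictionary-encoded keys
value = cache.decode(2)
# value == "office"
```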
+<h2>How does it perform?</h2>
+<p>Here are my results on <a
href="https://github.com/apache/datafusion/tree/main/benchmarks#clickbench">ClickBench</a>
on my AMD 9900X machine. The total time is reduced by 15%, with Q23 being
2.24x faster,
+and queries that get slower are likely due to noise.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓</span>
+<span class="err">┃ Query ┃ no-pushdown ┃ new-pushdown ┃
Change ┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; QQuery 0 &boxv; 0.47ms
&boxv; 0.43ms &boxv; +1.10x faster &boxv;</span>
+<span class="err">&boxv; QQuery 1 &boxv; 51.10ms
&boxv; 50.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 2 &boxv; 68.23ms
&boxv; 64.49ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 3 &boxv; 90.68ms
&boxv; 86.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 4 &boxv; 458.93ms
&boxv; 458.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 5 &boxv; 522.06ms
&boxv; 478.50ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 6 &boxv; 49.84ms
&boxv; 49.94ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 7 &boxv; 55.09ms
&boxv; 55.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 8 &boxv; 565.26ms
&boxv; 556.95ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 9 &boxv; 575.83ms
&boxv; 575.05ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 10 &boxv; 164.56ms
&boxv; 178.23ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 11 &boxv; 177.20ms
&boxv; 191.32ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 12 &boxv; 591.05ms
&boxv; 569.92ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 13 &boxv; 861.06ms
&boxv; 848.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 14 &boxv; 596.20ms
&boxv; 580.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 15 &boxv; 554.96ms
&boxv; 548.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 16 &boxv; 1175.08ms
&boxv; 1146.07ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 17 &boxv; 1150.45ms
&boxv; 1121.49ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 18 &boxv; 2634.75ms
&boxv; 2494.07ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 19 &boxv; 90.15ms
&boxv; 89.24ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 20 &boxv; 620.15ms
&boxv; 591.67ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 21 &boxv; 782.38ms
&boxv; 703.15ms &boxv; +1.11x faster &boxv;</span>
+<span class="err">&boxv; QQuery 22 &boxv; 1927.94ms
&boxv; 1404.35ms &boxv; +1.37x faster &boxv;</span>
+<span class="err">&boxv; QQuery 23 &boxv; 8104.11ms
&boxv; 3610.76ms &boxv; +2.24x faster &boxv;</span>
+<span class="err">&boxv; QQuery 24 &boxv; 360.79ms
&boxv; 330.55ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 25 &boxv; 290.61ms
&boxv; 252.54ms &boxv; +1.15x faster &boxv;</span>
+<span class="err">&boxv; QQuery 26 &boxv; 395.18ms
&boxv; 362.72ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 27 &boxv; 891.76ms
&boxv; 959.39ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 28 &boxv; 4059.54ms
&boxv; 4137.37ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 29 &boxv; 235.88ms
&boxv; 228.99ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 30 &boxv; 564.22ms
&boxv; 584.65ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 31 &boxv; 741.20ms
&boxv; 757.87ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 32 &boxv; 2652.48ms
&boxv; 2574.19ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 33 &boxv; 2373.71ms
&boxv; 2327.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 34 &boxv; 2391.00ms
&boxv; 2342.15ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 35 &boxv; 700.79ms
&boxv; 694.51ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 36 &boxv; 151.51ms
&boxv; 152.93ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 37 &boxv; 108.18ms
&boxv; 86.03ms &boxv; +1.26x faster &boxv;</span>
+<span class="err">&boxv; QQuery 38 &boxv; 114.64ms
&boxv; 106.22ms &boxv; +1.08x faster &boxv;</span>
+<span class="err">&boxv; QQuery 39 &boxv; 260.80ms
&boxv; 239.13ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 40 &boxv; 60.74ms
&boxv; 73.29ms &boxv; 1.21x slower &boxv;</span>
+<span class="err">&boxv; QQuery 41 &boxv; 58.75ms
&boxv; 67.85ms &boxv; 1.15x slower &boxv;</span>
+<span class="err">&boxv; QQuery 42 &boxv; 65.49ms
&boxv; 68.11ms &boxv; no change &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&
[...]
+<span
class="err">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓</span>
+<span class="err">┃ Benchmark Summary ┃
┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; Total Time (no-pushdown) &boxv;
38344.79ms &boxv;</span>
+<span class="err">&boxv; Total Time (new-pushdown) &boxv;
32800.50ms &boxv;</span>
+<span class="err">&boxv; Average Time (no-pushdown) &boxv;
891.74ms &boxv;</span>
+<span class="err">&boxv; Average Time (new-pushdown) &boxv;
762.80ms &boxv;</span>
+<span class="err">&boxv; Queries Faster &boxv;
13 &boxv;</span>
+<span class="err">&boxv; Queries Slower &boxv;
5 &boxv;</span>
+<span class="err">&boxv; Queries with No Change &boxv;
25 &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxul;</span>
+</code></pre></div>
+<h2>Conclusion</h2>
+<p>Despite being simple in theory, filter pushdown in Parquet is
non-trivial to implement.
+It requires understanding both the Parquet format and reader implementation
details.
+The challenge lies in efficiently navigating through the dynamics of decoding,
filter evaluation, and memory management.</p>
+<p>If you are interested in this level of optimization and want to help
test, document and implement this type of optimization, come find us in the
<a
href="https://datafusion.apache.org/contributor-guide/communication.html">DataFusion
Community</a>. We would love to have you. </p></content><category
term="blog"></category></entry><entry><title>Apache DataFusion Comet 0.7.0
Release</title><link
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0 [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/feeds/blog.atom.xml b/output/feeds/blog.atom.xml
index 5de58a8..dbe61aa 100644
--- a/output/feeds/blog.atom.xml
+++ b/output/feeds/blog.atom.xml
@@ -1,5 +1,293 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
DataFusion Comet 0.7.0 Release</title><link
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0 [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
blog</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-21T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Efficient
Filter Pushdown in Parquet</title><link
href="https://datafusion.apache.org/blog/2025/03/21/parquet-pushdown" rel="
[...]
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></summary><content type="html"><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post</a>, we discussed how <a
href="https://datafusion.apache.org/">Apache DataFusion</a> prunes
<a href="https://parquet.apache.org/">Apache Parquet</a> files to
skip irrelevant <strong>files/row_groups</strong> (sometimes also
<a
href="https://parquet.apache.org/docs/file-format/pageindex/">pages</a>).</p>
+<p>This post discusses how Parquet readers skip irrelevant
<strong>rows</strong> while scanning data,
+leveraging Parquet's columnar layout by first reading only filter columns,
+and then selectively reading other columns only for matching rows.</p>
+<h2>Why filter pushdown in Parquet?</h2>
+<p>Below is an example query that reads sensor data with filters on
<code>date_time</code> and <code>location</code>.
+Without filter pushdown, all rows from <code>location</code>,
<code>val</code>, and <code>date_time</code> columns
are decoded before <code>location='office'</code> is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span><span
class="p">,</span> <span class="k">location</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<figure>
+<img alt="Parquet pruning skips irrelevant files/row_groups, while filter
pushdown skips irrelevant rows. Without filter pushdown, all rows from
location, val, and date_time columns are decoded before `location='office'` is
evaluated. Filter pushdown is especially useful when the filter is selective,
i.e., removes many rows." class="img-responsive"
src="/blog/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg" width="80%"/>
+<figcaption>
+ Parquet pruning skips irrelevant files/row_groups, while filter pushdown
skips irrelevant rows. Without filter pushdown, all rows from location, val,
and date_time columns are decoded before `location='office'` is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.
+ </figcaption>
+</figure>
+<p>In our setup, sensor data is aggregated by date &mdash; each day
has its own Parquet file.
+At planning time, DataFusion prunes the unneeded Parquet files, i.e.,
<code>2025-03-10.parquet</code> and
<code>2025-03-11.parquet</code>.</p>
+<p>Once the files to read are located, <a
href="https://github.com/apache/datafusion/issues/3463"><em>DataFusion's
current default implementation</em></a> reads all the projected
columns (<code>sensor_id</code>, <code>val</code>, and
<code>location</code>) into Arrow RecordBatches, then applies the
filters over <code>location</code> to get the final set of
rows.</p>
+<p>A better approach is called <strong>filter
pushdown</strong> with <strong>late materialization</strong>,
which evaluates filter conditions first and only decodes data that passes these
conditions.
+In practice, this works by first processing only the filter columns
(<code>date_time</code> and <code>location</code>),
building a boolean mask of rows that satisfy our conditions, then using this
mask to selectively decode only the relevant rows from other columns
(<code>sensor_id</code>, <code>val</code>).
+This eliminates the waste of decoding rows that will be immediately filtered
out.</p>
+<p>While simple in theory, practical implementations often make
performance worse.</p>
+<h2>How can filter pushdown be slower?</h2>
+<p>At a high level, the Parquet reader first builds a filter mask --
essentially a boolean array indicating which rows meet the filter criteria --
and then uses this mask to selectively decode only the needed rows from the
remaining columns in the projection.</p>
+<p>Let's dig into details of <a
href="https://github.com/apache/arrow-rs/blob/d5339f31a60a4bd8a4256e7120fe32603249d88e/parquet/src/arrow/async_reader/mod.rs#L618-L712">how
filter pushdown is implemented</a> in the current Rust Parquet reader
implementation, illustrated in the following figure.</p>
+<figure>
+<img alt="Implementation of filter pushdown in Rust Parquet readers"
class="img-responsive" src="/blog/images/parquet-pushdown/baseline-impl.jpg"
width="70%"/>
+<figcaption>
+ Implementation of filter pushdown in Rust Parquet readers -- the first
phase builds the filter mask, the second phase applies the filter mask to the
other columns
+ </figcaption>
+</figure>
+<p>The filter pushdown has two phases:</p>
+<ol>
+<li>
+<p>Build the filter mask (steps 1-3)</p>
+</li>
+<li>
+<p>Use the filter mask to selectively decode other columns (steps 4-7),
e.g., the output of step 3 is used as input for steps 5 and 7.</p>
+</li>
+</ol>
+<p>Within each phase, it takes three steps from Parquet to
Arrow:</p>
+<ol>
+<li>
+<p>Decompress the Parquet pages using generic decompression algorithms
like LZ4, Zstd, etc. (steps 1, 4, 6)</p>
+</li>
+<li>
+<p>Decode the page content into Arrow format (steps 2, 5, 7)</p>
+</li>
+<li>
+<p>Evaluate the filter over Arrow data (step 3)</p>
+</li>
+</ol>
+<p>In the figure above, we can see that
<code>location</code> is <strong>decompressed and decoded
twice</strong>, first when building the filter mask (steps 1, 2), and
second when building the output (steps 4, 5).
+This happens for all columns that appear both in the filter and
output.</p>
+<p>The table below shows the corresponding CPU time on the <a
href="https://github.com/apache/datafusion/blob/main/benchmarks/queries/clickbench/queries.sql#L23">ClickBench
query 22</a>:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| Decompress | Decode | Apply filter| Others
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| 206 ms | 117 ms | 22 ms | 48 ms
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+</code></pre></div>
+<p>Clearly, decompress/decode operations dominate the time spent. With
filter pushdown, columns in both the filter and the output are
decompressed/decoded twice; without filter pushdown, each column is handled
only once.
+This explains why filter pushdown is slower in some cases.</p>
+<blockquote>
+<p><strong>Note:</strong> Highly selective filters may skip
entire pages; but as long as the reader needs even one row from a page, it
must decompress (and often decode) the entire page.</p>
+</blockquote>
+<h2>Attempt: cache filter columns</h2>
+<p>Intuitively, caching the filter columns and reusing them later could
help.</p>
+<p>But naively caching decoded pages consumes prohibitively high
memory:</p>
+<ol>
+<li>
+<p>It needs to cache Arrow arrays, which are on average <a
href="https://github.com/XiangpengHao/liquid-cache/blob/main/dev/doc/liquid-cache-vldb.pdf">4x
larger than Parquet data</a>.</p>
+</li>
+<li>
+<p>It needs to cache the <strong>entire column chunk in
memory</strong>, because in Phase 1 it builds filters over the column
chunk, and only uses it in Phase 2.</p>
+</li>
+<li>
+<p>The memory usage is proportional to the number of filter columns,
which can be unboundedly high. </p>
+</li>
+</ol>
+<p>Worse, caching filter columns means it needs to read partially from
Parquet and partially from cache, which is complex to implement, likely
requiring a substantial change to the current implementation. </p>
+<blockquote>
+<p><strong>Feel the complexity:</strong> consider building a
cache that properly handles nested columns, multiple filters, and filters with
multiple columns.</p>
+</blockquote>
+<h2>Real solution</h2>
+<p>We need a solution that:</p>
+<ol>
+<li>
+<p>Is simple to implement, i.e., doesn't require thousands of lines of
code.</p>
+</li>
+<li>
+<p>Incurs minimal memory overhead.</p>
+</li>
+</ol>
+<p>This section describes my <a
href="https://github.com/apache/arrow-rs/pull/6921#issuecomment-2718792433">&lt;700
LOC PR (with lots of comments and tests)</a> that <strong>reduces
total ClickBench time by 15%, with up to 2x lower latency for some queries, no
obvious regression on other queries, and caches at most 2 pages (~2MB) per
column in memory</strong>.</p>
+<figure>
+<img alt="New decoding pipeline, building filter mask and output columns
are interleaved in a single pass, allowing us to cache minimal pages for
minimal amount of time" class="img-responsive"
src="/blog/images/parquet-pushdown/new-pipeline.jpg" width="80%"/>
+<figcaption>
+ New decoding pipeline, building filter mask and output columns are
interleaved in a single pass, allowing us to cache minimal pages for minimal
amount of time
+ </figcaption>
+</figure>
+<p>The new pipeline interleaves the previous two phases into a single
pass, so that:</p>
+<ol>
+<li>
+<p>The page being decompressed is immediately used to build filter masks
and output columns.</p>
+</li>
+<li>
+<p>Decompressed pages are cached for minimal time; after one pass (steps
1-6), the cache memory is released for the next pass. </p>
+</li>
+</ol>
+<p>This allows the cache to only hold 1 page at a time, and to
immediately discard the previous page after it's used, significantly reducing
the memory requirement for caching.</p>
+<h3>What pages are cached?</h3>
+<p>You may have noticed that only <code>location</code> is
cached, not <code>val</code>, because <code>val</code>
is only used for output.
+More generally, only columns that appear both in the filter and output are
cached, and at most 1 page is cached for each such column.</p>
+<p>More examples:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, no columns are cached, because
<code>val</code> is not used for filtering.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span
class="k">COUNT</span><span class="p">(</span><span
class="o">*</span><span class="p">)</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, again, no columns are cached, because the output
projection is empty after query plan optimization.</p>
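<p>A minimal sketch of the caching rule above: a column is cached only if it appears in both the filter columns and the output projection, i.e., a set intersection. The helper below is hypothetical, not the actual arrow-rs code:</p>

```rust
use std::collections::HashSet;

/// Hypothetical helper: a column is cached only if it appears in both
/// the filter columns and the output projection.
fn cached_columns<'a>(filter: &[&'a str], output: &[&'a str]) -> Vec<&'a str> {
    let out: HashSet<&str> = output.iter().copied().collect();
    filter.iter().copied().filter(|c| out.contains(c)).collect()
}

fn main() {
    // SELECT val, location ... WHERE date_time > ... AND location = ...
    assert_eq!(
        cached_columns(&["date_time", "location"], &["val", "location"]),
        vec!["location"]
    );
    // SELECT val ...: no overlap between filter and output, nothing cached.
    assert!(cached_columns(&["date_time", "location"], &["val"]).is_empty());
    // SELECT COUNT(*): empty output projection, nothing cached.
    assert!(cached_columns(&["date_time", "location"], &[]).is_empty());
}
```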
+<h3>Then why cache 2 pages per column instead of 1?</h3>
+<p>This is another real-world nuance in how Parquet lays out the
pages.</p>
+<p>Parquet by default encodes data using <a
href="https://parquet.apache.org/docs/file-format/data-pages/encodings/">dictionary
encoding</a>, which writes a dictionary page as the first page of a
column chunk, followed by the keys referencing the dictionary.</p>
+<p>You can see this in action using <a
href="https://parquet-viewer.xiangpeng.systems">parquet-viewer</a>:</p>
+<figure>
+<img alt="Parquet viewer shows the page layout of a column chunk"
class="img-responsive" src="/blog/images/parquet-pushdown/parquet-viewer.jpg"
width="80%"/>
+<figcaption>
+ Parquet viewer shows the page layout of a column chunk
+ </figcaption>
+</figure>
+<p>This means that decoding a single data page actually touches two
pages: the dictionary page and the data page itself.</p>
+<p>This is why it caches 2 pages per column: one dictionary page and one
data page.
+The data page slot will move forward as it reads the data; but the dictionary
page slot always references the first page.</p>
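<p>A minimal sketch of this two-slot scheme, assuming a simplified <code>Page</code> type (the real arrow-rs structures differ): the dictionary slot is pinned while the data slot slides forward.</p>

```rust
/// Hypothetical stand-in for a decompressed Parquet page; not the
/// actual arrow-rs data structure.
#[derive(Debug)]
struct Page {
    offset: usize,
    bytes: Vec<u8>,
}

/// Per-column cache: one pinned slot for the dictionary page and one
/// sliding slot for the current data page, so at most two pages per
/// column are held at any time.
#[derive(Default)]
struct ColumnPageCache {
    dictionary: Option<Page>, // pinned: always the first page of the chunk
    data: Option<Page>,       // moves forward as the reader advances
}

impl ColumnPageCache {
    /// The dictionary slot is set once and never replaced.
    fn set_dictionary(&mut self, page: Page) {
        self.dictionary.get_or_insert(page);
    }

    /// The data slot is replaced on every new page, releasing the old one.
    fn advance_data(&mut self, page: Page) {
        self.data = Some(page);
    }
}

fn main() {
    let mut cache = ColumnPageCache::default();
    cache.set_dictionary(Page { offset: 0, bytes: vec![1, 2] });
    cache.advance_data(Page { offset: 100, bytes: vec![3] });
    cache.advance_data(Page { offset: 200, bytes: vec![4] });
    // Dictionary stays pinned; the data slot holds only the latest page.
    assert_eq!(cache.dictionary.as_ref().unwrap().offset, 0);
    assert_eq!(cache.data.as_ref().unwrap().offset, 200);
}
```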
+<figure>
+<img alt="Cached two pages, one for dictionary (pinned), one for data
(moves as it reads the data)" class="img-responsive"
src="/blog/images/parquet-pushdown/cached-pages.jpg" width="80%"/>
+<figcaption>
+ Cached two pages, one for dictionary (pinned), one for data (moves as it
reads the data)
+ </figcaption>
+</figure>
+<h2>How does it perform?</h2>
+<p>Here are my results on <a
href="https://github.com/apache/datafusion/tree/main/benchmarks#clickbench">ClickBench</a>
on my AMD 9900X machine. The total time is reduced by 15%, with Q23 being
2.24x faster;
+the queries that get slower are likely within measurement noise.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓</span>
+<span class="err">┃ Query ┃ no-pushdown ┃ new-pushdown ┃
Change ┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; QQuery 0 &boxv; 0.47ms
&boxv; 0.43ms &boxv; +1.10x faster &boxv;</span>
+<span class="err">&boxv; QQuery 1 &boxv; 51.10ms
&boxv; 50.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 2 &boxv; 68.23ms
&boxv; 64.49ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 3 &boxv; 90.68ms
&boxv; 86.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 4 &boxv; 458.93ms
&boxv; 458.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 5 &boxv; 522.06ms
&boxv; 478.50ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 6 &boxv; 49.84ms
&boxv; 49.94ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 7 &boxv; 55.09ms
&boxv; 55.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 8 &boxv; 565.26ms
&boxv; 556.95ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 9 &boxv; 575.83ms
&boxv; 575.05ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 10 &boxv; 164.56ms
&boxv; 178.23ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 11 &boxv; 177.20ms
&boxv; 191.32ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 12 &boxv; 591.05ms
&boxv; 569.92ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 13 &boxv; 861.06ms
&boxv; 848.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 14 &boxv; 596.20ms
&boxv; 580.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 15 &boxv; 554.96ms
&boxv; 548.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 16 &boxv; 1175.08ms
&boxv; 1146.07ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 17 &boxv; 1150.45ms
&boxv; 1121.49ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 18 &boxv; 2634.75ms
&boxv; 2494.07ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 19 &boxv; 90.15ms
&boxv; 89.24ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 20 &boxv; 620.15ms
&boxv; 591.67ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 21 &boxv; 782.38ms
&boxv; 703.15ms &boxv; +1.11x faster &boxv;</span>
+<span class="err">&boxv; QQuery 22 &boxv; 1927.94ms
&boxv; 1404.35ms &boxv; +1.37x faster &boxv;</span>
+<span class="err">&boxv; QQuery 23 &boxv; 8104.11ms
&boxv; 3610.76ms &boxv; +2.24x faster &boxv;</span>
+<span class="err">&boxv; QQuery 24 &boxv; 360.79ms
&boxv; 330.55ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 25 &boxv; 290.61ms
&boxv; 252.54ms &boxv; +1.15x faster &boxv;</span>
+<span class="err">&boxv; QQuery 26 &boxv; 395.18ms
&boxv; 362.72ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 27 &boxv; 891.76ms
&boxv; 959.39ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 28 &boxv; 4059.54ms
&boxv; 4137.37ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 29 &boxv; 235.88ms
&boxv; 228.99ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 30 &boxv; 564.22ms
&boxv; 584.65ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 31 &boxv; 741.20ms
&boxv; 757.87ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 32 &boxv; 2652.48ms
&boxv; 2574.19ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 33 &boxv; 2373.71ms
&boxv; 2327.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 34 &boxv; 2391.00ms
&boxv; 2342.15ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 35 &boxv; 700.79ms
&boxv; 694.51ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 36 &boxv; 151.51ms
&boxv; 152.93ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 37 &boxv; 108.18ms
&boxv; 86.03ms &boxv; +1.26x faster &boxv;</span>
+<span class="err">&boxv; QQuery 38 &boxv; 114.64ms
&boxv; 106.22ms &boxv; +1.08x faster &boxv;</span>
+<span class="err">&boxv; QQuery 39 &boxv; 260.80ms
&boxv; 239.13ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 40 &boxv; 60.74ms
&boxv; 73.29ms &boxv; 1.21x slower &boxv;</span>
+<span class="err">&boxv; QQuery 41 &boxv; 58.75ms
&boxv; 67.85ms &boxv; 1.15x slower &boxv;</span>
+<span class="err">&boxv; QQuery 42 &boxv; 65.49ms
&boxv; 68.11ms &boxv; no change &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&
[...]
+<span
class="err">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓</span>
+<span class="err">┃ Benchmark Summary ┃
┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; Total Time (no-pushdown) &boxv;
38344.79ms &boxv;</span>
+<span class="err">&boxv; Total Time (new-pushdown) &boxv;
32800.50ms &boxv;</span>
+<span class="err">&boxv; Average Time (no-pushdown) &boxv;
891.74ms &boxv;</span>
+<span class="err">&boxv; Average Time (new-pushdown) &boxv;
762.80ms &boxv;</span>
+<span class="err">&boxv; Queries Faster &boxv;
13 &boxv;</span>
+<span class="err">&boxv; Queries Slower &boxv;
5 &boxv;</span>
+<span class="err">&boxv; Queries with No Change &boxv;
25 &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxul;</span>
+</code></pre></div>
+<h2>Conclusion</h2>
+<p>Despite being simple in theory, filter pushdown in Parquet is
non-trivial to implement.
+It requires understanding both the Parquet format and reader implementation
details.
+The challenge lies in efficiently navigating through the dynamics of decoding,
filter evaluation, and memory management.</p>
+<p>If you are interested in this level of optimization and want to help
test, document and implement this type of optimization, come find us in the
<a
href="https://datafusion.apache.org/contributor-guide/communication.html">DataFusion
Community</a>. We would love to have you. </p></content><category
term="blog"></category></entry><entry><title>Apache DataFusion Comet 0.7.0
Release</title><link
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0 [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/feeds/xiangpeng-hao.atom.xml
b/output/feeds/xiangpeng-hao.atom.xml
index 8d6ba2f..0b818dd 100644
--- a/output/feeds/xiangpeng-hao.atom.xml
+++ b/output/feeds/xiangpeng-hao.atom.xml
@@ -1,5 +1,293 @@
<?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
Xiangpeng Hao</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/xiangpeng-hao.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Parquet
Pruning in DataFusion: Read Only What Matters</title><link
href="https://datafusion.apache.org/blog [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog -
Xiangpeng Hao</title><link href="https://datafusion.apache.org/blog/"
rel="alternate"></link><link
href="https://datafusion.apache.org/blog/feeds/xiangpeng-hao.atom.xml"
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-21T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Efficient
Filter Pushdown in Parquet</title><link
href="https://datafusion.apache.org/blog/2025/03/21/parqu [...]
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></summary><content type="html"><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post</a>, we discussed how <a
href="https://datafusion.apache.org/">Apache DataFusion</a> prunes
<a href="https://parquet.apache.org/">Apache Parquet</a> files to
skip irrelevant <strong>files/row_groups</strong> (sometimes also
<a
href="https://parquet.apache.org/docs/file-format/pageindex/">pages</a>).</p>
+<p>This post discusses how Parquet readers skip irrelevant
<strong>rows</strong> while scanning data,
+leveraging Parquet's columnar layout by first reading only filter columns,
+and then selectively reading other columns only for matching rows.</p>
+<h2>Why filter pushdown in Parquet?</h2>
+<p>Below is an example query that reads sensor data with filters on
<code>date_time</code> and <code>location</code>.
+Without filter pushdown, all rows from <code>location</code>,
<code>val</code>, and <code>date_time</code> columns
are decoded before <code>location='office'</code> is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span><span
class="p">,</span> <span class="k">location</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<figure>
+<img alt="Parquet pruning skips irrelevant files/row_groups, while filter
pushdown skips irrelevant rows. Without filter pushdown, all rows from
location, val, and date_time columns are decoded before `location='office'` is
evaluated. Filter pushdown is especially useful when the filter is selective,
i.e., removes many rows." class="img-responsive"
src="/blog/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg" width="80%"/>
+<figcaption>
+ Parquet pruning skips irrelevant files/row_groups, while filter pushdown
skips irrelevant rows. Without filter pushdown, all rows from location, val,
and date_time columns are decoded before `location='office'` is evaluated.
Filter pushdown is especially useful when the filter is selective, i.e.,
removes many rows.
+ </figcaption>
+</figure>
+<p>In our setup, sensor data is aggregated by date &mdash; each day
has its own Parquet file.
+At planning time, DataFusion prunes the unneeded Parquet files, i.e.,
<code>2025-03-10.parquet</code> and
<code>2025-03-11.parquet</code>.</p>
+<p>Once the files to read are located, <a
href="https://github.com/apache/datafusion/issues/3463"><em>DataFusion's
current default implementation</em></a> reads all the projected
columns (<code>sensor_id</code>, <code>val</code>, and
<code>location</code>) into Arrow RecordBatches, then applies the
filters over <code>location</code> to get the final set of
rows.</p>
+<p>A better approach is called <strong>filter
pushdown</strong> with <strong>late materialization</strong>,
which evaluates filter conditions first and only decodes data that passes these
conditions.
+In practice, this works by first processing only the filter columns
(<code>date_time</code> and <code>location</code>),
building a boolean mask of rows that satisfy our conditions, then using this
mask to selectively decode only the relevant rows from other columns
(<code>sensor_id</code>, <code>val</code>).
+This eliminates the waste of decoding rows that will be immediately filtered
out.</p>
+<p>While simple in theory, practical implementations often make
performance worse.</p>
+<h2>How can filter pushdown be slower?</h2>
+<p>At a high level, the Parquet reader first builds a filter mask --
essentially a boolean array indicating which rows meet the filter criteria --
and then uses this mask to selectively decode only the needed rows from the
remaining columns in the projection.</p>
+<p>Let's dig into the details of <a
href="https://github.com/apache/arrow-rs/blob/d5339f31a60a4bd8a4256e7120fe32603249d88e/parquet/src/arrow/async_reader/mod.rs#L618-L712">how
filter pushdown is implemented</a> in the current Rust Parquet
reader, illustrated in the following figure.</p>
+<figure>
+<img alt="Implementation of filter pushdown in Rust Parquet readers"
class="img-responsive" src="/blog/images/parquet-pushdown/baseline-impl.jpg"
width="70%"/>
+<figcaption>
+ Implementation of filter pushdown in Rust Parquet readers -- the first
phase builds the filter mask, the second phase applies the filter mask to the
other columns
+ </figcaption>
+</figure>
+<p>The filter pushdown has two phases:</p>
+<ol>
+<li>
+<p>Build the filter mask (steps 1-3)</p>
+</li>
+<li>
+<p>Use the filter mask to selectively decode the other columns (steps 4-7);
e.g., the output of step 3 is used as input for steps 5 and 7.</p>
+</li>
+</ol>
+<p>Within each phase, going from Parquet to Arrow takes three
steps:</p>
+<ol>
+<li>
+<p>Decompress the Parquet pages using generic decompression algorithms
like LZ4, Zstd, etc. (steps 1, 4, 6)</p>
+</li>
+<li>
+<p>Decode the page content into Arrow format (steps 2, 5, 7)</p>
+</li>
+<li>
+<p>Evaluate the filter over Arrow data (step 3)</p>
+</li>
+</ol>
+<p>In the figure above, we can see that
<code>location</code> is <strong>decompressed and decoded
twice</strong>, first when building the filter mask (steps 1, 2), and
second when building the output (steps 4, 5).
+This happens for all columns that appear both in the filter and
output.</p>
+<p>The table below shows the corresponding CPU time on the <a
href="https://github.com/apache/datafusion/blob/main/benchmarks/queries/clickbench/queries.sql#L23">ClickBench
query 22</a>:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| Decompress | Decode | Apply filter| Others
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+<span class="err">| 206 ms | 117 ms | 22 ms | 48 ms
|</span>
+<span
class="err">+------------+--------+-------------+--------+</span>
+</code></pre></div>
+<p>Clearly, decompress/decode operations dominate the time spent. With
filter pushdown, the reader decompresses and decodes the filter columns
twice; without filter pushdown, it does so only once.
+This explains why filter pushdown can be slower in some cases.</p>
+<blockquote>
+<p><strong>Note:</strong> Highly selective filters may skip
entire pages; but as long as the reader needs even one row from a page, it
must decompress (and often decode) the entire page.</p>
+</blockquote>
+<h2>Attempt: cache filter columns</h2>
+<p>Intuitively, caching the filter columns and reusing them later could
help.</p>
+<p>But naively caching decoded pages consumes prohibitively high
memory:</p>
+<ol>
+<li>
+<p>It needs to cache Arrow arrays, which are on average <a
href="https://github.com/XiangpengHao/liquid-cache/blob/main/dev/doc/liquid-cache-vldb.pdf">4x
larger than Parquet data</a>.</p>
+</li>
+<li>
+<p>It needs to cache the <strong>entire column chunk in
memory</strong>, because in Phase 1 it builds filters over the column
chunk, and only uses it in Phase 2.</p>
+</li>
+<li>
+<p>The memory usage is proportional to the number of filter columns,
which can be unboundedly high. </p>
+</li>
+</ol>
+<p>Worse, caching filter columns means it needs to read partially from
Parquet and partially from cache, which is complex to implement, likely
requiring a substantial change to the current implementation. </p>
+<blockquote>
+<p><strong>Feel the complexity:</strong> consider building a
cache that properly handles nested columns, multiple filters, and filters with
multiple columns.</p>
+</blockquote>
+<h2>Real solution</h2>
+<p>We need a solution that:</p>
+<ol>
+<li>
+<p>Is simple to implement, i.e., doesn't require thousands of lines of
code.</p>
+</li>
+<li>
+<p>Incurs minimal memory overhead.</p>
+</li>
+</ol>
+<p>This section describes my <a
href="https://github.com/apache/arrow-rs/pull/6921#issuecomment-2718792433">&lt;700
LOC PR (with lots of comments and tests)</a> that <strong>reduces
total ClickBench time by 15%, with up to 2x lower latency for some queries, no
obvious regression on other queries, and caches at most 2 pages (~2MB) per
column in memory</strong>.</p>
+<figure>
+<img alt="New decoding pipeline, building filter mask and output columns
are interleaved in a single pass, allowing us to cache minimal pages for
minimal amount of time" class="img-responsive"
src="/blog/images/parquet-pushdown/new-pipeline.jpg" width="80%"/>
+<figcaption>
+ New decoding pipeline, building filter mask and output columns are
interleaved in a single pass, allowing us to cache minimal pages for minimal
amount of time
+ </figcaption>
+</figure>
+<p>The new pipeline interleaves the previous two phases into a single
pass, so that:</p>
+<ol>
+<li>
+<p>The page being decompressed is immediately used to build filter masks
and output columns.</p>
+</li>
+<li>
+<p>Decompressed pages are cached for minimal time; after one pass (steps
1-6), the cache memory is released for the next pass. </p>
+</li>
+</ol>
+<p>This allows the cache to only hold 1 page at a time, and to
immediately discard the previous page after it's used, significantly reducing
the memory requirement for caching.</p>
+<h3>What pages are cached?</h3>
+<p>You may have noticed that only <code>location</code> is
cached, not <code>val</code>, because <code>val</code>
is only used for output.
+More generally, only columns that appear both in the filter and output are
cached, and at most 1 page is cached for each such column.</p>
+<p>More examples:</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span class="n">val</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, no columns are cached, because
<code>val</code> is not used for filtering.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="k">SELECT</span> <span
class="k">COUNT</span><span class="p">(</span><span
class="o">*</span><span class="p">)</span>
+<span class="k">FROM</span> <span
class="n">sensor_data</span>
+<span class="k">WHERE</span> <span
class="n">date_time</span> <span class="o">&gt;</span>
<span class="s1">'2025-03-11'</span> <span
class="k">AND</span> <span class="k">location</span>
<span class="o">=</span> <span
class="s1">'office'</span><span class="p">;</span>
+</code></pre></div>
+<p>In this case, again, no columns are cached, because the output
projection is empty after query plan optimization.</p>
+<h3>Then why cache 2 pages per column instead of 1?</h3>
+<p>This is another real-world nuance in how Parquet lays out the
pages.</p>
+<p>Parquet by default encodes data using <a
href="https://parquet.apache.org/docs/file-format/data-pages/encodings/">dictionary
encoding</a>, which writes a dictionary page as the first page of a
column chunk, followed by the keys referencing the dictionary.</p>
+<p>You can see this in action using <a
href="https://parquet-viewer.xiangpeng.systems">parquet-viewer</a>:</p>
+<figure>
+<img alt="Parquet viewer shows the page layout of a column chunk"
class="img-responsive" src="/blog/images/parquet-pushdown/parquet-viewer.jpg"
width="80%"/>
+<figcaption>
+ Parquet viewer shows the page layout of a column chunk
+ </figcaption>
+</figure>
+<p>This means that decoding a single data page actually touches two
pages: the dictionary page and the data page itself.</p>
+<p>This is why it caches 2 pages per column: one dictionary page and one
data page.
+The data page slot will move forward as it reads the data; but the dictionary
page slot always references the first page.</p>
+<figure>
+<img alt="Cached two pages, one for dictionary (pinned), one for data
(moves as it reads the data)" class="img-responsive"
src="/blog/images/parquet-pushdown/cached-pages.jpg" width="80%"/>
+<figcaption>
+ Cached two pages, one for dictionary (pinned), one for data (moves as it
reads the data)
+ </figcaption>
+</figure>
+<h2>How does it perform?</h2>
+<p>Here are my results on <a
href="https://github.com/apache/datafusion/tree/main/benchmarks#clickbench">ClickBench</a>
on my AMD 9900X machine. The total time is reduced by 15%, with Q23 being
2.24x faster;
+the queries that get slower are likely within measurement noise.</p>
+<div
class="codehilite"><pre><span></span><code><span
class="err">┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓</span>
+<span class="err">┃ Query ┃ no-pushdown ┃ new-pushdown ┃
Change ┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; QQuery 0 &boxv; 0.47ms
&boxv; 0.43ms &boxv; +1.10x faster &boxv;</span>
+<span class="err">&boxv; QQuery 1 &boxv; 51.10ms
&boxv; 50.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 2 &boxv; 68.23ms
&boxv; 64.49ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 3 &boxv; 90.68ms
&boxv; 86.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 4 &boxv; 458.93ms
&boxv; 458.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 5 &boxv; 522.06ms
&boxv; 478.50ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 6 &boxv; 49.84ms
&boxv; 49.94ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 7 &boxv; 55.09ms
&boxv; 55.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 8 &boxv; 565.26ms
&boxv; 556.95ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 9 &boxv; 575.83ms
&boxv; 575.05ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 10 &boxv; 164.56ms
&boxv; 178.23ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 11 &boxv; 177.20ms
&boxv; 191.32ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 12 &boxv; 591.05ms
&boxv; 569.92ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 13 &boxv; 861.06ms
&boxv; 848.59ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 14 &boxv; 596.20ms
&boxv; 580.73ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 15 &boxv; 554.96ms
&boxv; 548.77ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 16 &boxv; 1175.08ms
&boxv; 1146.07ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 17 &boxv; 1150.45ms
&boxv; 1121.49ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 18 &boxv; 2634.75ms
&boxv; 2494.07ms &boxv; +1.06x faster &boxv;</span>
+<span class="err">&boxv; QQuery 19 &boxv; 90.15ms
&boxv; 89.24ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 20 &boxv; 620.15ms
&boxv; 591.67ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 21 &boxv; 782.38ms
&boxv; 703.15ms &boxv; +1.11x faster &boxv;</span>
+<span class="err">&boxv; QQuery 22 &boxv; 1927.94ms
&boxv; 1404.35ms &boxv; +1.37x faster &boxv;</span>
+<span class="err">&boxv; QQuery 23 &boxv; 8104.11ms
&boxv; 3610.76ms &boxv; +2.24x faster &boxv;</span>
+<span class="err">&boxv; QQuery 24 &boxv; 360.79ms
&boxv; 330.55ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 25 &boxv; 290.61ms
&boxv; 252.54ms &boxv; +1.15x faster &boxv;</span>
+<span class="err">&boxv; QQuery 26 &boxv; 395.18ms
&boxv; 362.72ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 27 &boxv; 891.76ms
&boxv; 959.39ms &boxv; 1.08x slower &boxv;</span>
+<span class="err">&boxv; QQuery 28 &boxv; 4059.54ms
&boxv; 4137.37ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 29 &boxv; 235.88ms
&boxv; 228.99ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 30 &boxv; 564.22ms
&boxv; 584.65ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 31 &boxv; 741.20ms
&boxv; 757.87ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 32 &boxv; 2652.48ms
&boxv; 2574.19ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 33 &boxv; 2373.71ms
&boxv; 2327.10ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 34 &boxv; 2391.00ms
&boxv; 2342.15ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 35 &boxv; 700.79ms
&boxv; 694.51ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 36 &boxv; 151.51ms
&boxv; 152.93ms &boxv; no change &boxv;</span>
+<span class="err">&boxv; QQuery 37 &boxv; 108.18ms
&boxv; 86.03ms &boxv; +1.26x faster &boxv;</span>
+<span class="err">&boxv; QQuery 38 &boxv; 114.64ms
&boxv; 106.22ms &boxv; +1.08x faster &boxv;</span>
+<span class="err">&boxv; QQuery 39 &boxv; 260.80ms
&boxv; 239.13ms &boxv; +1.09x faster &boxv;</span>
+<span class="err">&boxv; QQuery 40 &boxv; 60.74ms
&boxv; 73.29ms &boxv; 1.21x slower &boxv;</span>
+<span class="err">&boxv; QQuery 41 &boxv; 58.75ms
&boxv; 67.85ms &boxv; 1.15x slower &boxv;</span>
+<span class="err">&boxv; QQuery 42 &boxv; 65.49ms
&boxv; 68.11ms &boxv; no change &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&
[...]
+<span
class="err">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓</span>
+<span class="err">┃ Benchmark Summary ┃
┃</span>
+<span
class="err">┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩</span>
+<span class="err">&boxv; Total Time (no-pushdown) &boxv;
38344.79ms &boxv;</span>
+<span class="err">&boxv; Total Time (new-pushdown) &boxv;
32800.50ms &boxv;</span>
+<span class="err">&boxv; Average Time (no-pushdown) &boxv;
891.74ms &boxv;</span>
+<span class="err">&boxv; Average Time (new-pushdown) &boxv;
762.80ms &boxv;</span>
+<span class="err">&boxv; Queries Faster &boxv;
13 &boxv;</span>
+<span class="err">&boxv; Queries Slower &boxv;
5 &boxv;</span>
+<span class="err">&boxv; Queries with No Change &boxv;
25 &boxv;</span>
+<span
class="err">&boxur;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxhu;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxul;</span>
+</code></pre></div>
+<h2>Conclusion</h2>
+<p>Although filter pushdown is conceptually simple, it is non-trivial to
implement in Parquet.
+It requires a detailed understanding of both the Parquet format and the reader
implementation.
+The challenge lies in efficiently coordinating decoding, filter evaluation,
and memory management.</p>
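The interplay described above can be sketched in miniature: evaluate the predicate on the filter column first, build a row selection, and only then materialize the remaining columns for the surviving rows. This is a hypothetical, illustrative Python sketch of the general idea, not the DataFusion/parquet-rs implementation; the function and parameter names are invented for this example.

```python
def pushdown_scan(filter_col, payload_cols, predicate):
    """Return only the rows where predicate(filter value) holds.

    filter_col:   list of values for the predicate column
    payload_cols: dict of column name -> list of values
    predicate:    callable evaluated on the filter column only
    """
    # Step 1: evaluate the predicate on the (cheaply decoded) filter column.
    selection = [predicate(v) for v in filter_col]

    # Step 2: "decode" payload columns only for selected rows, avoiding
    # wasted work on rows the filter already rejected.
    selected_rows = [i for i, keep in enumerate(selection) if keep]
    return {
        name: [col[i] for i in selected_rows]
        for name, col in payload_cols.items()
    }


result = pushdown_scan(
    filter_col=[1, 5, 3, 9],
    payload_cols={"name": ["a", "b", "c", "d"]},
    predicate=lambda v: v > 2,
)
# result == {"name": ["b", "c", "d"]}
```

The real reader must additionally manage page-level decoding, caching of already-decoded filter columns, and selection representations, which is where most of the implementation complexity lives.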
+<p>If you are interested in this level of optimization and want to help
test, document, and implement this type of optimization, come find us in the
<a
href="https://datafusion.apache.org/contributor-guide/communication.html">DataFusion
Community</a>. We would love to have you. </p></content><category
term="blog"></category></entry><entry><title>Parquet Pruning in DataFusion:
Read Only What Matters</title><link
href="https://datafusion.apache.org/blog/2025/03/20/pa [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/feeds/xiangpeng-hao.rss.xml
b/output/feeds/xiangpeng-hao.rss.xml
index d48cdfa..bfe62fb 100644
--- a/output/feeds/xiangpeng-hao.rss.xml
+++ b/output/feeds/xiangpeng-hao.rss.xml
@@ -1,5 +1,43 @@
<?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion Blog - Xiangpeng
Hao</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
20 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Parquet Pruning in
DataFusion: Read Only What
Matters</title><link>https://datafusion.apache.org/blog/2025/03/20/parquet-pruning</link><description><!--
+<rss version="2.0"><channel><title>Apache DataFusion Blog - Xiangpeng
Hao</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Fri,
21 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Efficient Filter
Pushdown in
Parquet</title><link>https://datafusion.apache.org/blog/2025/03/21/parquet-pushdown</link><description><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng
Hao's blog</a>. Thanks to <a
href="https://www.influxdata.com/">InfluxData</a> for sponsoring this
work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></description><dc:creator
xmlns:dc="http://purl.org/dc/elements/1.1/">Xiangpeng
Hao</dc:creator><pubDate>Fri, 21 Mar 2025 00:00:00 +0000</pubDate><guid
isPermaLink="false">tag:datafusion.apache.org,2025-03-21:/blog/2025/03/21/parquet-pushdown</guid><category>blog</category></item><item><title>Parquet
Pruning in DataFusion: Read Only What Matters</title><link>ht [...]
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
diff --git a/output/images/parquet-pushdown/baseline-impl.jpg
b/output/images/parquet-pushdown/baseline-impl.jpg
new file mode 100644
index 0000000..28f7e59
Binary files /dev/null and b/output/images/parquet-pushdown/baseline-impl.jpg
differ
diff --git a/output/images/parquet-pushdown/cached-pages.jpg
b/output/images/parquet-pushdown/cached-pages.jpg
new file mode 100644
index 0000000..0949544
Binary files /dev/null and b/output/images/parquet-pushdown/cached-pages.jpg
differ
diff --git a/output/images/parquet-pushdown/new-pipeline.jpg
b/output/images/parquet-pushdown/new-pipeline.jpg
new file mode 100644
index 0000000..93db49b
Binary files /dev/null and b/output/images/parquet-pushdown/new-pipeline.jpg
differ
diff --git a/output/images/parquet-pushdown/parquet-viewer.jpg
b/output/images/parquet-pushdown/parquet-viewer.jpg
new file mode 100644
index 0000000..71de38d
Binary files /dev/null and b/output/images/parquet-pushdown/parquet-viewer.jpg
differ
diff --git a/output/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg
b/output/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg
new file mode 100644
index 0000000..bedcf10
Binary files /dev/null and
b/output/images/parquet-pushdown/pushdown-vs-no-pushdown.jpg differ
diff --git a/output/index.html b/output/index.html
index 9fecc28..ce50f7d 100644
--- a/output/index.html
+++ b/output/index.html
@@ -44,6 +44,63 @@
<p><i>Here you can find the latest updates from DataFusion and
related projects.</i></p>
+ <!-- Post -->
+ <div class="row">
+ <div class="callout">
+ <article class="post">
+ <header>
+ <div class="title">
+ <h1><a
href="/blog/2025/03/21/parquet-pushdown">Efficient Filter Pushdown in
Parquet</a></h1>
+ <p>Posted on: Fri 21 March 2025 by Xiangpeng Hao</p>
+ <p><style>
+figure {
+ margin: 20px 0;
+}
+
+figure img {
+ display: block;
+ max-width: 80%;
+}
+
+figcaption {
+ font-style: italic;
+ margin-top: 10px;
+ color: #555;
+ font-size: 0.9em;
+ max-width: 80%;
+}
+</style>
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p><em>Editor's Note: This blog was first published on <a
href="https://blog.xiangpeng.systems/posts/parquet-pushdown/">Xiangpeng Hao's
blog</a>. Thanks to <a href="https://www.influxdata.com/">InfluxData</a> for
sponsoring this work as part of his PhD funding.</em></p>
+<hr/>
+<p>In the <a
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning">previous
post …</a></p></p>
+ <footer>
+ <ul class="actions">
+ <div style="text-align: right"><a
href="/blog/2025/03/21/parquet-pushdown" class="button medium">Continue
Reading</a></div>
+ </ul>
+ <ul class="stats">
+ </ul>
+ </footer>
+ </article>
+ </div>
+ </div>
<!-- Post -->
<div class="row">
<div class="callout">
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]