This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git


The following commit(s) were added to refs/heads/asf-staging by this push:
     new 7d51f2e  Commit build products
7d51f2e is described below

commit 7d51f2e28375cf21872fa8782e82a8e01649dc06
Author: Build Pelican (action) <[email protected]>
AuthorDate: Fri Nov 21 16:11:14 2025 +0000

    Commit build products
---
 blog/2025/11/25/datafusion-51.0.0/index.html | 27 ++++++++++-----------------
 blog/feeds/all-en.atom.xml                   | 23 ++++++++---------------
 blog/feeds/blog.atom.xml                     | 23 ++++++++---------------
 blog/feeds/pmc.atom.xml                      | 23 ++++++++---------------
 4 files changed, 34 insertions(+), 62 deletions(-)

diff --git a/blog/2025/11/25/datafusion-51.0.0/index.html 
b/blog/2025/11/25/datafusion-51.0.0/index.html
index 4b4de5d..45061cb 100644
--- a/blog/2025/11/25/datafusion-51.0.0/index.html
+++ b/blog/2025/11/25/datafusion-51.0.0/index.html
@@ -49,8 +49,8 @@
 <li><a href="#introduction">Introduction</a></li>
 <li><a href="#performance-improvements">Performance Improvements 🚀</a><ul>
 <li><a href="#faster-case-expression-evaluation">Faster CASE expression 
evaluation</a></li>
-<li><a href="#faster-parquet-metadata-parsing">Faster Parquet metadata 
parsing</a></li>
 <li><a href="#better-defaults-for-remote-parquet-reads">Better Defaults for 
Remote Parquet Reads</a></li>
+<li><a href="#faster-parquet-metadata-parsing">Faster Parquet metadata 
parsing</a></li>
 </ul>
 </li>
 <li><a href="#new-features">New Features ✨</a><ul>
@@ -107,13 +107,14 @@ Expressions short‑circuit earlier, reuse partial results, 
and avoid unnecessar
 scattering, speeding up common ETL patterns. Thanks to <a 
href="https://github.com/pepijnve">pepijnve</a>, <a 
href="https://github.com/chenkovsky">chenkovsky</a>,
 and <a href="https://github.com/petern48">petern48</a> for leading this 
effort. We hope to share more details on our
 implementation in a future post.</p>
-<p><strong>Fewer object store round-trips for Parquet by default</strong></p>
-<p>DataFusion now sets a default <code>metadata_size_hint</code> for <a 
href="https://parquet.apache.org/">Apache Parquet</a> scans
-(<a href="https://github.com/apache/datafusion/issues/18118">#18118</a>), 
avoiding the extra
-“last 8‑byte” request many clouds require to read file footers. Remote scans
-typically drop from five requests to four per file, cutting latency and 
transfer
-costs without any application changes. Thanks to <a 
href="https://github.com/zhuqi-lucas">zhuqi-lucas</a> for leading this
-effort.</p>
+<h3 id="better-defaults-for-remote-parquet-reads">Better Defaults for Remote 
Parquet Reads<a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link">¶</a></h3>
+<p>By default, DataFusion now always fetches the last 512KB (configurable) of 
<a href="https://parquet.apache.org/">Apache Parquet</a> files
+which usually includes the footer and metadata (<a 
href="https://github.com/apache/datafusion/issues/18118">#18118</a>). This 
+change typically avoids 2 I/O requests for each Parquet. While this
+setting has existed in DataFusion for many years, it was not previously enabled
+by default. Users can tune the number of bytes fetched in the initial I/O
+request via the <code>datafusion.execution.parquet.metadata_size_hint</code> 
<a href="https://datafusion.apache.org/user-guide/configs.html">config 
setting</a>. Thanks to
+<a href="https://github.com/zhuqi-lucas">zhuqi-lucas</a> for leading this 
effort.</p>
 <h3 id="faster-parquet-metadata-parsing">Faster Parquet metadata parsing<a 
class="headerlink" href="#faster-parquet-metadata-parsing" title="Permanent 
link">¶</a></h3>
 <p>DataFusion 51 also includes the latest Parquet reader from
 <a href="https://arrow.apache.org/blog/2025/10/30/arrow-rs-57.0.0/">Arrow Rust 
57.0.0</a>, which parses Parquet metadata significantly faster. This is
@@ -122,14 +123,6 @@ where startup time or low latency is important. You can 
read more about the upst
 <a href="https://github.com/etseidl">etseidl</a> and <a 
href="https://github.com/jhorstmann">jhorstmann</a> that enabled these 
improvements in the <a 
href="https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/">Faster 
Apache Parquet Footer Metadata Using a Custom Thrift Parser</a> blog.</p>
 <p><img alt="Metadata Parsing Performance Improvements in Arrow/Parquet 57" 
class="img-responsive" 
src="/blog/images/datafusion-51.0.0/arrow-57-metadata-parsing.png" 
width="100%"/></p>
 <p><strong>Figure 2</strong>: Metadata parsing performance improvements in 
Arrow/Parquet 57.0.0. </p>
-<h3 id="better-defaults-for-remote-parquet-reads">Better Defaults for Remote 
Parquet Reads<a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link">¶</a></h3>
-<p>By default, DataFusion now fetches the last 512KB (configurable) of Parquet 
files
-so the first request usually includes the full footer (<a 
href="https://github.com/apache/datafusion/issues/18118">#18118</a>). This will
-typically avoid two distinct I/O requests for each Parquet file. While this
-setting has existed in DataFusion for many years, it was not previously enabled
-by default. Users can tune the number of bytes fetched in the initial I/O
-request via the <code>datafusion.execution.parquet.metadata_size_hint</code> 
<a href="https://datafusion.apache.org/user-guide/configs.html">config 
setting</a>. Thanks to
-<a href="https://github.com/zhuqi-lucas">zhuqi-lucas</a> for 
effort.</p>
 <h2 id="new-features">New Features ✨<a class="headerlink" href="#new-features" 
title="Permanent link">¶</a></h2>
 <h3 id="decimal32decimal64-support">Decimal32/Decimal64 support<a 
class="headerlink" href="#decimal32decimal64-support" title="Permanent 
link">¶</a></h3>
 <p>The new Arrow types <code>Decimal32</code> and <code>Decimal64</code> are 
now supported in DataFusion
@@ -314,8 +307,8 @@ can find out how to reach us on the <a 
href="https://datafusion.apache.org/contr
 <li><a href="#introduction">Introduction</a></li>
 <li><a href="#performance-improvements">Performance Improvements 🚀</a><ul>
 <li><a href="#faster-case-expression-evaluation">Faster CASE expression 
evaluation</a></li>
-<li><a href="#faster-parquet-metadata-parsing">Faster Parquet metadata 
parsing</a></li>
 <li><a href="#better-defaults-for-remote-parquet-reads">Better Defaults for 
Remote Parquet Reads</a></li>
+<li><a href="#faster-parquet-metadata-parsing">Faster Parquet metadata 
parsing</a></li>
 </ul>
 </li>
 <li><a href="#new-features">New Features ✨</a><ul>
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 482cdf5..88c0c37 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -62,13 +62,14 @@ Expressions short‑circuit earlier, reuse partial results, 
and avoid unnecessar
 scattering, speeding up common ETL patterns. Thanks to &lt;a 
href="https://github.com/pepijnve"&gt;pepijnve&lt;/a&gt;, &lt;a 
href="https://github.com/chenkovsky"&gt;chenkovsky&lt;/a&gt;,
 and &lt;a href="https://github.com/petern48"&gt;petern48&lt;/a&gt; for leading 
this effort. We hope to share more details on our
 implementation in a future post.&lt;/p&gt;
-&lt;p&gt;&lt;strong&gt;Fewer object store round-trips for Parquet by 
default&lt;/strong&gt;&lt;/p&gt;
-&lt;p&gt;DataFusion now sets a default 
&lt;code&gt;metadata_size_hint&lt;/code&gt; for &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; scans
-(&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;), 
avoiding the extra
-“last 8‑byte” request many clouds require to read file footers. Remote scans
-typically drop from five requests to four per file, cutting latency and 
transfer
-costs without any application changes. Thanks to &lt;a 
href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for leading this
-effort.&lt;/p&gt;
+&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
+&lt;p&gt;By default, DataFusion now always fetches the last 512KB 
(configurable) of &lt;a href="https://parquet.apache.org/"&gt;Apache 
Parquet&lt;/a&gt; files
+which usually includes the footer and metadata (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This 
+change typically avoids 2 I/O requests for each Parquet. While this
+setting has existed in DataFusion for many years, it was not previously enabled
+by default. Users can tune the number of bytes fetched in the initial I/O
+request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
+&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h3 id="faster-parquet-metadata-parsing"&gt;Faster Parquet metadata 
parsing&lt;a class="headerlink" href="#faster-parquet-metadata-parsing" 
title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;DataFusion 51 also includes the latest Parquet reader from
 &lt;a 
href="https://arrow.apache.org/blog/2025/10/30/arrow-rs-57.0.0/"&gt;Arrow Rust 
57.0.0&lt;/a&gt;, which parses Parquet metadata significantly faster. This is
@@ -77,14 +78,6 @@ where startup time or low latency is important. You can read 
more about the upst
 &lt;a href="https://github.com/etseidl"&gt;etseidl&lt;/a&gt; and &lt;a 
href="https://github.com/jhorstmann"&gt;jhorstmann&lt;/a&gt; that enabled these 
improvements in the &lt;a 
href="https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/"&gt;Faster
 Apache Parquet Footer Metadata Using a Custom Thrift Parser&lt;/a&gt; 
blog.&lt;/p&gt;
 &lt;p&gt;&lt;img alt="Metadata Parsing Performance Improvements in 
Arrow/Parquet 57" class="img-responsive" 
src="/blog/images/datafusion-51.0.0/arrow-57-metadata-parsing.png" 
width="100%"/&gt;&lt;/p&gt;
 &lt;p&gt;&lt;strong&gt;Figure 2&lt;/strong&gt;: Metadata parsing performance 
improvements in Arrow/Parquet 57.0.0. &lt;/p&gt;
-&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
-&lt;p&gt;By default, DataFusion now fetches the last 512KB (configurable) of 
Parquet files
-so the first request usually includes the full footer (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This will
-typically avoid two distinct I/O requests for each Parquet file. While this
-setting has existed in DataFusion for many years, it was not previously enabled
-by default. Users can tune the number of bytes fetched in the initial I/O
-request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
-&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h2 id="new-features"&gt;New Features ✨&lt;a class="headerlink" 
href="#new-features" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
 &lt;h3 id="decimal32decimal64-support"&gt;Decimal32/Decimal64 support&lt;a 
class="headerlink" href="#decimal32decimal64-support" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;The new Arrow types &lt;code&gt;Decimal32&lt;/code&gt; and 
&lt;code&gt;Decimal64&lt;/code&gt; are now supported in DataFusion
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index 8ae26c1..2f62d68 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -62,13 +62,14 @@ Expressions short‑circuit earlier, reuse partial results, 
and avoid unnecessar
 scattering, speeding up common ETL patterns. Thanks to &lt;a 
href="https://github.com/pepijnve"&gt;pepijnve&lt;/a&gt;, &lt;a 
href="https://github.com/chenkovsky"&gt;chenkovsky&lt;/a&gt;,
 and &lt;a href="https://github.com/petern48"&gt;petern48&lt;/a&gt; for leading 
this effort. We hope to share more details on our
 implementation in a future post.&lt;/p&gt;
-&lt;p&gt;&lt;strong&gt;Fewer object store round-trips for Parquet by 
default&lt;/strong&gt;&lt;/p&gt;
-&lt;p&gt;DataFusion now sets a default 
&lt;code&gt;metadata_size_hint&lt;/code&gt; for &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; scans
-(&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;), 
avoiding the extra
-“last 8‑byte” request many clouds require to read file footers. Remote scans
-typically drop from five requests to four per file, cutting latency and 
transfer
-costs without any application changes. Thanks to &lt;a 
href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for leading this
-effort.&lt;/p&gt;
+&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
+&lt;p&gt;By default, DataFusion now always fetches the last 512KB 
(configurable) of &lt;a href="https://parquet.apache.org/"&gt;Apache 
Parquet&lt;/a&gt; files
+which usually includes the footer and metadata (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This 
+change typically avoids 2 I/O requests for each Parquet. While this
+setting has existed in DataFusion for many years, it was not previously enabled
+by default. Users can tune the number of bytes fetched in the initial I/O
+request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
+&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h3 id="faster-parquet-metadata-parsing"&gt;Faster Parquet metadata 
parsing&lt;a class="headerlink" href="#faster-parquet-metadata-parsing" 
title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;DataFusion 51 also includes the latest Parquet reader from
 &lt;a 
href="https://arrow.apache.org/blog/2025/10/30/arrow-rs-57.0.0/"&gt;Arrow Rust 
57.0.0&lt;/a&gt;, which parses Parquet metadata significantly faster. This is
@@ -77,14 +78,6 @@ where startup time or low latency is important. You can read 
more about the upst
 &lt;a href="https://github.com/etseidl"&gt;etseidl&lt;/a&gt; and &lt;a 
href="https://github.com/jhorstmann"&gt;jhorstmann&lt;/a&gt; that enabled these 
improvements in the &lt;a 
href="https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/"&gt;Faster
 Apache Parquet Footer Metadata Using a Custom Thrift Parser&lt;/a&gt; 
blog.&lt;/p&gt;
 &lt;p&gt;&lt;img alt="Metadata Parsing Performance Improvements in 
Arrow/Parquet 57" class="img-responsive" 
src="/blog/images/datafusion-51.0.0/arrow-57-metadata-parsing.png" 
width="100%"/&gt;&lt;/p&gt;
 &lt;p&gt;&lt;strong&gt;Figure 2&lt;/strong&gt;: Metadata parsing performance 
improvements in Arrow/Parquet 57.0.0. &lt;/p&gt;
-&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
-&lt;p&gt;By default, DataFusion now fetches the last 512KB (configurable) of 
Parquet files
-so the first request usually includes the full footer (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This will
-typically avoid two distinct I/O requests for each Parquet file. While this
-setting has existed in DataFusion for many years, it was not previously enabled
-by default. Users can tune the number of bytes fetched in the initial I/O
-request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
-&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h2 id="new-features"&gt;New Features ✨&lt;a class="headerlink" 
href="#new-features" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
 &lt;h3 id="decimal32decimal64-support"&gt;Decimal32/Decimal64 support&lt;a 
class="headerlink" href="#decimal32decimal64-support" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;The new Arrow types &lt;code&gt;Decimal32&lt;/code&gt; and 
&lt;code&gt;Decimal64&lt;/code&gt; are now supported in DataFusion
diff --git a/blog/feeds/pmc.atom.xml b/blog/feeds/pmc.atom.xml
index 7eefc86..d60f9ed 100644
--- a/blog/feeds/pmc.atom.xml
+++ b/blog/feeds/pmc.atom.xml
@@ -62,13 +62,14 @@ Expressions short‑circuit earlier, reuse partial results, 
and avoid unnecessar
 scattering, speeding up common ETL patterns. Thanks to &lt;a 
href="https://github.com/pepijnve"&gt;pepijnve&lt;/a&gt;, &lt;a 
href="https://github.com/chenkovsky"&gt;chenkovsky&lt;/a&gt;,
 and &lt;a href="https://github.com/petern48"&gt;petern48&lt;/a&gt; for leading 
this effort. We hope to share more details on our
 implementation in a future post.&lt;/p&gt;
-&lt;p&gt;&lt;strong&gt;Fewer object store round-trips for Parquet by 
default&lt;/strong&gt;&lt;/p&gt;
-&lt;p&gt;DataFusion now sets a default 
&lt;code&gt;metadata_size_hint&lt;/code&gt; for &lt;a 
href="https://parquet.apache.org/"&gt;Apache Parquet&lt;/a&gt; scans
-(&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;), 
avoiding the extra
-“last 8‑byte” request many clouds require to read file footers. Remote scans
-typically drop from five requests to four per file, cutting latency and 
transfer
-costs without any application changes. Thanks to &lt;a 
href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for leading this
-effort.&lt;/p&gt;
+&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
+&lt;p&gt;By default, DataFusion now always fetches the last 512KB 
(configurable) of &lt;a href="https://parquet.apache.org/"&gt;Apache 
Parquet&lt;/a&gt; files
+which usually includes the footer and metadata (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This 
+change typically avoids 2 I/O requests for each Parquet. While this
+setting has existed in DataFusion for many years, it was not previously enabled
+by default. Users can tune the number of bytes fetched in the initial I/O
+request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
+&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h3 id="faster-parquet-metadata-parsing"&gt;Faster Parquet metadata 
parsing&lt;a class="headerlink" href="#faster-parquet-metadata-parsing" 
title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;DataFusion 51 also includes the latest Parquet reader from
 &lt;a 
href="https://arrow.apache.org/blog/2025/10/30/arrow-rs-57.0.0/"&gt;Arrow Rust 
57.0.0&lt;/a&gt;, which parses Parquet metadata significantly faster. This is
@@ -77,14 +78,6 @@ where startup time or low latency is important. You can read 
more about the upst
 &lt;a href="https://github.com/etseidl"&gt;etseidl&lt;/a&gt; and &lt;a 
href="https://github.com/jhorstmann"&gt;jhorstmann&lt;/a&gt; that enabled these 
improvements in the &lt;a 
href="https://arrow.apache.org/blog/2025/10/23/rust-parquet-metadata/"&gt;Faster
 Apache Parquet Footer Metadata Using a Custom Thrift Parser&lt;/a&gt; 
blog.&lt;/p&gt;
 &lt;p&gt;&lt;img alt="Metadata Parsing Performance Improvements in 
Arrow/Parquet 57" class="img-responsive" 
src="/blog/images/datafusion-51.0.0/arrow-57-metadata-parsing.png" 
width="100%"/&gt;&lt;/p&gt;
 &lt;p&gt;&lt;strong&gt;Figure 2&lt;/strong&gt;: Metadata parsing performance 
improvements in Arrow/Parquet 57.0.0. &lt;/p&gt;
-&lt;h3 id="better-defaults-for-remote-parquet-reads"&gt;Better Defaults for 
Remote Parquet Reads&lt;a class="headerlink" 
href="#better-defaults-for-remote-parquet-reads" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
-&lt;p&gt;By default, DataFusion now fetches the last 512KB (configurable) of 
Parquet files
-so the first request usually includes the full footer (&lt;a 
href="https://github.com/apache/datafusion/issues/18118"&gt;#18118&lt;/a&gt;). 
This will
-typically avoid two distinct I/O requests for each Parquet file. While this
-setting has existed in DataFusion for many years, it was not previously enabled
-by default. Users can tune the number of bytes fetched in the initial I/O
-request via the 
&lt;code&gt;datafusion.execution.parquet.metadata_size_hint&lt;/code&gt; &lt;a 
href="https://datafusion.apache.org/user-guide/configs.html"&gt;config 
setting&lt;/a&gt;. Thanks to
-&lt;a href="https://github.com/zhuqi-lucas"&gt;zhuqi-lucas&lt;/a&gt; for 
leading this effort.&lt;/p&gt;
 &lt;h2 id="new-features"&gt;New Features ✨&lt;a class="headerlink" 
href="#new-features" title="Permanent link"&gt;¶&lt;/a&gt;&lt;/h2&gt;
 &lt;h3 id="decimal32decimal64-support"&gt;Decimal32/Decimal64 support&lt;a 
class="headerlink" href="#decimal32decimal64-support" title="Permanent 
link"&gt;¶&lt;/a&gt;&lt;/h3&gt;
 &lt;p&gt;The new Arrow types &lt;code&gt;Decimal32&lt;/code&gt; and 
&lt;code&gt;Decimal64&lt;/code&gt; are now supported in DataFusion

