This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 19a894e  Commit build products
19a894e is described below

commit 19a894e727cf40123aec8a400568b35d770fc1bb
Author: Build Pelican (action) <[email protected]>
AuthorDate: Thu Mar 20 22:38:29 2025 +0000

    Commit build products
---
 .../2025/03/20/datafusion-comet-0.7.0/index.html   | 140 +++++++++++++++++++++
 output/author/pmc.html                             |  40 ++++++
 output/category/blog.html                          |  40 ++++++
 output/feed.xml                                    |  23 +++-
 output/feeds/all-en.atom.xml                       | 102 ++++++++++++++-
 output/feeds/blog.atom.xml                         | 102 ++++++++++++++-
 output/feeds/pmc.atom.xml                          | 102 ++++++++++++++-
 output/feeds/pmc.rss.xml                           |  23 +++-
 output/images/comet-0.7.0/performance.png          | Bin 0 -> 34131 bytes
 output/index.html                                  |  40 ++++++
 10 files changed, 607 insertions(+), 5 deletions(-)

diff --git a/output/2025/03/20/datafusion-comet-0.7.0/index.html 
b/output/2025/03/20/datafusion-comet-0.7.0/index.html
new file mode 100644
index 0000000..92044f5
--- /dev/null
+++ b/output/2025/03/20/datafusion-comet-0.7.0/index.html
@@ -0,0 +1,140 @@
+<!doctype html>
+<html class="no-js" lang="en" dir="ltr">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="x-ua-compatible" content="ie=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Apache DataFusion Comet 0.7.0 Release - Apache DataFusion 
Blog</title>
+<link href="/blog/css/bootstrap.min.css" rel="stylesheet">
+<link href="/blog/css/fontawesome.all.min.css" rel="stylesheet">
+<link href="/blog/css/headerlink.css" rel="stylesheet">
+<link href="/blog/highlight/default.min.css" rel="stylesheet">
+<script src="/blog/highlight/highlight.js"></script>
+<script>hljs.highlightAll();</script>  </head>
+  <body class="d-flex flex-column h-100">
+  <main class="flex-shrink-0">
+<!-- nav bar -->
+<nav class="navbar navbar-expand-lg navbar-dark bg-dark" aria-label="Fifth 
navbar example">
+    <div class="container-fluid">
+        <a class="navbar-brand" href="/blog"><img 
src="/blog/images/logo_original4x.png" style="height: 32px;"/> Apache 
DataFusion Blog</a>
+        <button class="navbar-toggler" type="button" data-bs-toggle="collapse" 
data-bs-target="#navbarADP" aria-controls="navbarADP" aria-expanded="false" 
aria-label="Toggle navigation">
+            <span class="navbar-toggler-icon"></span>
+        </button>
+
+        <div class="collapse navbar-collapse" id="navbarADP">
+            <ul class="navbar-nav me-auto mb-2 mb-lg-0">
+                <li class="nav-item">
+                    <a class="nav-link" href="/blog/about.html">About</a>
+                </li>
+                <li class="nav-item">
+                    <a class="nav-link" href="/blog/feed.xml">RSS</a>
+                </li>
+            </ul>
+        </div>
+    </div>
+</nav>    
+
+
+<!-- page contents -->
+<div id="contents">
+    <div class="bg-white p-5 rounded">
+        <div class="col-sm-8 mx-auto">
+          <h1>
+              Apache DataFusion Comet 0.7.0 Release
+          </h1>
+              <p>Posted on: Thu 20 March 2025 by pmc</p>
+              <!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p>The Apache DataFusion PMC is pleased to announce version 0.7.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p>
+<p>Comet is an accelerator for Apache Spark that translates Spark physical 
plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code changes.</p>
+<p>Comet runs on commodity hardware and aims to provide 100% compatibility 
with Apache Spark. Any operators or
+expressions that are not fully compatible will fall back to Spark unless 
explicitly enabled by the user. Refer
+to the <a href="https://datafusion.apache.org/comet/user-guide/compatibility.html">compatibility guide</a> for more information.</p>
+<p>This release covers approximately four weeks of development work and is the 
result of merging 46 PRs from 11
+contributors. See the <a href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.7.0.md">change log</a> for more information.</p>
+<h2>Release Highlights</h2>
+<h3>Performance</h3>
+<p>Comet 0.7.0 is faster than the previous release, thanks to improvements in the native shuffle
+implementation and performance gains in DataFusion 46.</p>
+<p>For single-node TPC-H at 100 GB, Comet now delivers a <strong>greater than 
2x speedup</strong> compared to Spark using the same 
+CPU and RAM. Even with <strong>half the resources</strong>, Comet still 
provides a measurable performance improvement.</p>
+<p><img alt="Chart showing TPC-H benchmark results for Comet 0.7.0" 
class="img-responsive" src="/blog/images/comet-0.7.0/performance.png" 
width="100%"/></p>
+<p><em>These benchmarks were performed on a Linux workstation with PCIe 5, AMD 
7950X CPU (16 cores), 128 GB RAM, and data 
+stored locally in Parquet format on NVMe storage. Spark was running in 
Kubernetes with hard memory limits.</em></p>
+<h2>Shuffle Improvements</h2>
+<p>There are several improvements to shuffle in this release:</p>
+<ul>
+<li>When running in off-heap mode (which is the recommended approach), Comet previously used the wrong memory allocator
+  implementation for some types of shuffle operation, which could result in out-of-memory (OOM) errors rather than spilling to disk. This is now fixed.</li>
+<li>The number of spill files is drastically reduced. In previous releases, 
each instance of ShuffleMapTask could 
+  potentially create a new spill file for each output partition each time that 
spill was invoked. Comet now creates 
+  a maximum of one spill file per output partition per instance of 
ShuffleMapTask, which is appended to in subsequent 
+  spills.</li>
+<li>There was a flaw in the memory accounting which resulted in Comet 
requesting approximately twice the amount of 
+  memory that was needed, resulting in premature spilling. This is now 
resolved.</li>
+<li>The metric for number of spilled bytes is now accurate. It was previously 
reporting invalid information.</li>
+</ul>
+<h2>Improved Hash Join Performance</h2>
+<p>When using the <code>spark.comet.exec.replaceSortMergeJoin</code> setting to replace sort-merge joins with hash joins, Comet
+now does a better job of picking the optimal build side. Thanks to <a href="https://github.com/hayman42">@hayman42</a> for suggesting this, and thanks to the
+<a href="https://github.com/apache/incubator-gluten/">Apache Gluten (incubating)</a> project for the inspiration in implementing this feature.</p>
+<h2>Experimental Support for DataFusion&rsquo;s Parquet Scan</h2>
+<p>It is now possible to configure Comet to use DataFusion&rsquo;s Parquet 
reader instead of Comet&rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance optimizations that are not present in Comet&rsquo;s
+existing reader.</p>
+<p>Support should still be considered experimental, but most of Comet&rsquo;s 
unit tests are now passing with the new reader. 
+Known issues include handling of <code>INT96</code> timestamps and unsigned 
bytes and shorts.</p>
+<p>To enable DataFusion&rsquo;s Parquet reader, either set 
<code>spark.comet.scan.impl=native_datafusion</code> or set the environment 
+variable <code>COMET_PARQUET_SCAN_IMPL=native_datafusion</code>.</p>
+<h2>Complex Type Support</h2>
+<p>With DataFusion&rsquo;s Parquet reader enabled, there is now some early 
support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
+<a href="https://github.com/apache/datafusion-comet/issues/1043">https://github.com/apache/datafusion-comet/issues/1043</a>.</p>
+<h2>Updates to supported Spark versions</h2>
+<ul>
+<li>Comet 0.7.0 is now tested against Spark 3.5.4 rather than 3.5.1</li>
+<li>This will be the last Comet release to support Spark 3.3.x</li>
+</ul>
+<h2>Improved Tuning Guide</h2>
+<p>The <a href="https://datafusion.apache.org/comet/user-guide/tuning.html">Comet Tuning Guide</a> has been improved and now provides guidance on determining how much memory to allocate to
+Comet.</p>
+<h2>Getting Involved</h2>
+<p>The Comet project welcomes new contributors. We use the same <a href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord">Slack and Discord</a> channels as the main DataFusion
+project and have a weekly <a href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing">DataFusion video call</a>.</p>
+<p>The easiest way to get involved is to test Comet with your current Spark 
jobs and file issues for any bugs or
+performance regressions that you find. See the <a href="https://datafusion.apache.org/comet/user-guide/installation.html">Getting Started</a> guide for instructions on downloading and installing
+Comet.</p>
+<p>There are also many <a href="https://github.com/apache/datafusion-comet/contribute">good first issues</a> waiting for contributions.</p>
+        </div>
+      </div>
+    </div>    
+    <!-- footer -->
+    <div class="row">
+      <div class="large-12 medium-12 columns">
+        <p style="font-style: italic; font-size: 0.8rem; text-align: center;">
+          Copyright 2025, <a href="https://www.apache.org/">The Apache Software Foundation</a>, Licensed under the <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.<br/>
+          Apache&reg; and the Apache feather logo are trademarks of The Apache 
Software Foundation.
+        </p>
+      </div>
+    </div>
+    <script src="/blog/js/bootstrap.bundle.min.js"></script>  </main>
+  </body>
+</html>
diff --git a/output/author/pmc.html b/output/author/pmc.html
index b70432d..dc365a8 100644
--- a/output/author/pmc.html
+++ b/output/author/pmc.html
@@ -47,6 +47,46 @@
             <p><i>Here you can find the latest updates from DataFusion and 
related projects.</i></p>
 
 
+    <!-- Post -->
+    <div class="row">
+        <div class="callout">
+            <article class="post">
+                <header>
+                    <div class="title">
+                        <h1><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0">Apache DataFusion Comet 0.7.0 
Release</a></h1>
+                        <p>Posted on: Thu 20 March 2025 by pmc</p>
+                        <p><!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p>The Apache DataFusion PMC is pleased to announce version 0.7.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p>
+<p>Comet is an accelerator for Apache Spark that translates Spark physical 
plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code changes.</p>
+<p>Comet runs on commodity hardware and aims to …</p></p>
+                        <footer>
+                            <ul class="actions">
+                                <div style="text-align: right"><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0" class="button medium">Continue 
Reading</a></div>
+                            </ul>
+                            <ul class="stats">
+                            </ul>
+                        </footer>
+            </article>
+        </div>
+    </div>
     <!-- Post -->
     <div class="row">
         <div class="callout">
diff --git a/output/category/blog.html b/output/category/blog.html
index a911192..56f617b 100644
--- a/output/category/blog.html
+++ b/output/category/blog.html
@@ -47,6 +47,46 @@
             <p><i>Here you can find the latest updates from DataFusion and 
related projects.</i></p>
 
 
+    <!-- Post -->
+    <div class="row">
+        <div class="callout">
+            <article class="post">
+                <header>
+                    <div class="title">
+                        <h1><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0">Apache DataFusion Comet 0.7.0 
Release</a></h1>
+                        <p>Posted on: Thu 20 March 2025 by pmc</p>
+                        <p><!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p>The Apache DataFusion PMC is pleased to announce version 0.7.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p>
+<p>Comet is an accelerator for Apache Spark that translates Spark physical 
plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code changes.</p>
+<p>Comet runs on commodity hardware and aims to …</p></p>
+                        <footer>
+                            <ul class="actions">
+                                <div style="text-align: right"><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0" class="button medium">Continue 
Reading</a></div>
+                            </ul>
+                            <ul class="stats">
+                            </ul>
+                        </footer>
+            </article>
+        </div>
+    </div>
     <!-- Post -->
     <div class="row">
         <div class="callout">
diff --git a/output/feed.xml b/output/feed.xml
index e055437..fccf433 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -1,5 +1,26 @@
 <?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion 
Blog</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
 20 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Parquet Pruning in 
DataFusion: Read Only What 
Matters</title><link>https://datafusion.apache.org/blog/2025/03/20/parquet-pruning</link><description>&lt;!--
+<rss version="2.0"><channel><title>Apache DataFusion 
Blog</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
 20 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Apache DataFusion Comet 
0.7.0 
Release</title><link>https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0</link><description>&lt;!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to 
…&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Thu, 20 
Mar 2025 00:00:00 +0000</pubDate><guid 
isPermaLink="false">tag:datafusion.apache.org,2025-03-20:/blog/2025/03/20/datafusion-comet-0.7.0</guid><category>blog</category></item><item><title>Parquet
 Pruning in DataFusion: Read Only What 
Matters</title><link>https://datafusion.apache.org/blog/2025/03/20/parquet-pruning</link><d
 [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/all-en.atom.xml b/output/feeds/all-en.atom.xml
index 119fa38..2dc033d 100644
--- a/output/feeds/all-en.atom.xml
+++ b/output/feeds/all-en.atom.xml
@@ -1,5 +1,105 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion 
Blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Parquet
 Pruning in DataFusion: Read Only What Matters</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pru [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion 
Blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/all-en.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
 DataFusion Comet 0.7.0 Release</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0" rel 
[...]
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to 
…&lt;/p&gt;</summary><content type="html">&lt;!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to provide 100% 
compatibility with Apache Spark. Any operators or
+expressions that are not fully compatible will fall back to Spark unless 
explicitly enabled by the user. Refer
+to the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/compatibility.html"&gt;compatibility
 guide&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;p&gt;This release covers approximately four weeks of development work and 
is the result of merging 46 PRs from 11
+contributors. See the &lt;a 
href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.7.0.md"&gt;change
 log&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;h2&gt;Release Highlights&lt;/h2&gt;
+&lt;h3&gt;Performance&lt;/h3&gt;
+&lt;p&gt;Comet 0.7.0 is faster than the previous release, thanks to improvements in the native shuffle
+implementation and performance gains in DataFusion 46.&lt;/p&gt;
+&lt;p&gt;For single-node TPC-H at 100 GB, Comet now delivers a 
&lt;strong&gt;greater than 2x speedup&lt;/strong&gt; compared to Spark using 
the same 
+CPU and RAM. Even with &lt;strong&gt;half the resources&lt;/strong&gt;, Comet 
still provides a measurable performance improvement.&lt;/p&gt;
+&lt;p&gt;&lt;img alt="Chart showing TPC-H benchmark results for Comet 0.7.0" 
class="img-responsive" src="/blog/images/comet-0.7.0/performance.png" 
width="100%"/&gt;&lt;/p&gt;
+&lt;p&gt;&lt;em&gt;These benchmarks were performed on a Linux workstation with 
PCIe 5, AMD 7950X CPU (16 cores), 128 GB RAM, and data 
+stored locally in Parquet format on NVMe storage. Spark was running in 
Kubernetes with hard memory limits.&lt;/em&gt;&lt;/p&gt;
+&lt;h2&gt;Shuffle Improvements&lt;/h2&gt;
+&lt;p&gt;There are several improvements to shuffle in this release:&lt;/p&gt;
+&lt;ul&gt;
+&lt;li&gt;When running in off-heap mode (which is the recommended approach), Comet previously used the wrong memory allocator
+  implementation for some types of shuffle operation, which could result in out-of-memory (OOM) errors rather than spilling to disk. This is now fixed.&lt;/li&gt;
+&lt;li&gt;The number of spill files is drastically reduced. In previous 
releases, each instance of ShuffleMapTask could 
+  potentially create a new spill file for each output partition each time that 
spill was invoked. Comet now creates 
+  a maximum of one spill file per output partition per instance of 
ShuffleMapTask, which is appended to in subsequent 
+  spills.&lt;/li&gt;
+&lt;li&gt;There was a flaw in the memory accounting which resulted in Comet 
requesting approximately twice the amount of 
+  memory that was needed, resulting in premature spilling. This is now 
resolved.&lt;/li&gt;
+&lt;li&gt;The metric for number of spilled bytes is now accurate. It was 
previously reporting invalid information.&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Hash Join Performance&lt;/h2&gt;
+&lt;p&gt;When using the &lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to replace sort-merge joins with hash joins, Comet
+now does a better job of picking the optimal build side. Thanks to &lt;a href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, and thanks to the
+&lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache Gluten (incubating)&lt;/a&gt; project for the inspiration in implementing this feature.&lt;/p&gt;
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance optimizations that are not present in Comet&amp;rsquo;s
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
+Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
+&lt;h2&gt;Complex Type Support&lt;/h2&gt;
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
+&lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
+&lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
+&lt;ul&gt;
+&lt;li&gt;Comet 0.7.0 is now tested against Spark 3.5.4 rather than 
3.5.1&lt;/li&gt;
+&lt;li&gt;This will be the last Comet release to support Spark 3.3.x&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Tuning Guide&lt;/h2&gt;
+&lt;p&gt;The &lt;a 
href="https://datafusion.apache.org/comet/user-guide/tuning.html"&gt;Comet 
Tuning Guide&lt;/a&gt; has been improved and now provides guidance on 
determining how much memory to allocate to 
+Comet.&lt;/p&gt;
+&lt;h2&gt;Getting Involved&lt;/h2&gt;
+&lt;p&gt;The Comet project welcomes new contributors. We use the same &lt;a 
href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord"&gt;Slack
 and Discord&lt;/a&gt; channels as the main DataFusion
+project and have a weekly &lt;a 
href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing"&gt;DataFusion
 video call&lt;/a&gt;.&lt;/p&gt;
+&lt;p&gt;The easiest way to get involved is to test Comet with your current 
Spark jobs and file issues for any bugs or
+performance regressions that you find. See the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/installation.html"&gt;Getting
 Started&lt;/a&gt; guide for instructions on downloading and installing
+Comet.&lt;/p&gt;
+&lt;p&gt;There are also many &lt;a 
href="https://github.com/apache/datafusion-comet/contribute"&gt;good first 
issues&lt;/a&gt; waiting for contributions.&lt;/p&gt;</content><category 
term="blog"></category></entry><entry><title>Parquet Pruning in DataFusion: 
Read Only What Matters</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning" 
rel="alternate"></link><published>2025-03-20T00:00:00+00:00</published><updated>2025-03-20T00:00:00+00:00</updated><author><name
 [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/blog.atom.xml b/output/feeds/blog.atom.xml
index 63d79e3..5de58a8 100644
--- a/output/feeds/blog.atom.xml
+++ b/output/feeds/blog.atom.xml
@@ -1,5 +1,105 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Parquet
 Pruning in DataFusion: Read Only What Matters</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/parque [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
blog</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/blog.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
 DataFusion Comet 0.7.0 Release</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0 [...]
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to 
…&lt;/p&gt;</summary><content type="html">&lt;!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to provide 100% 
compatibility with Apache Spark. Any operators or
+expressions that are not fully compatible will fall back to Spark unless 
explicitly enabled by the user. Refer
+to the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/compatibility.html"&gt;compatibility
 guide&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;p&gt;This release covers approximately four weeks of development work and 
is the result of merging 46 PRs from 11
+contributors. See the &lt;a 
href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.7.0.md"&gt;change
 log&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;h2&gt;Release Highlights&lt;/h2&gt;
+&lt;h3&gt;Performance&lt;/h3&gt;
+&lt;p&gt;Comet 0.7.0 has improved performance compared to the previous release 
due to improvements in the native shuffle 
+implementation and performance improvements in DataFusion 46.&lt;/p&gt;
+&lt;p&gt;For single-node TPC-H at 100 GB, Comet now delivers a 
&lt;strong&gt;greater than 2x speedup&lt;/strong&gt; compared to Spark using 
the same 
+CPU and RAM. Even with &lt;strong&gt;half the resources&lt;/strong&gt;, Comet 
still provides a measurable performance improvement.&lt;/p&gt;
+&lt;p&gt;&lt;img alt="Chart showing TPC-H benchmark results for Comet 0.7.0" 
class="img-responsive" src="/blog/images/comet-0.7.0/performance.png" 
width="100%"/&gt;&lt;/p&gt;
+&lt;p&gt;&lt;em&gt;These benchmarks were performed on a Linux workstation with 
PCIe 5, AMD 7950X CPU (16 cores), 128 GB RAM, and data 
+stored locally in Parquet format on NVMe storage. Spark was running in 
Kubernetes with hard memory limits.&lt;/em&gt;&lt;/p&gt;
+&lt;h2&gt;Shuffle Improvements&lt;/h2&gt;
+&lt;p&gt;There are several improvements to shuffle in this release:&lt;/p&gt;
+&lt;ul&gt;
+&lt;li&gt;When running in off-heap mode (which is the recommended approach), 
Comet was using the wrong memory allocator 
+  implementation for some types of shuffle operation, which could result in 
OOM rather than spilling to disk.&lt;/li&gt;
+&lt;li&gt;The number of spill files is drastically reduced. In previous 
releases, each instance of ShuffleMapTask could 
+  potentially create a new spill file for each output partition each time that 
spill was invoked. Comet now creates 
+  a maximum of one spill file per output partition per instance of 
ShuffleMapTask, which is appended to in subsequent 
+  spills.&lt;/li&gt;
+&lt;li&gt;There was a flaw in the memory accounting which resulted in Comet 
requesting approximately twice the amount of 
+  memory that was needed, resulting in premature spilling. This is now 
resolved.&lt;/li&gt;
+&lt;li&gt;The metric for number of spilled bytes is now accurate. It was 
previously reporting invalid information.&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Hash Join Performance&lt;/h2&gt;
+&lt;p&gt;When using the 
&lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to 
replace sort-merge joins with hash joins, Comet 
+will now do a better job of picking the optimal build side. Thanks to &lt;a 
href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, 
and thanks to the 
+&lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache 
Gluten(incubating)&lt;/a&gt; project for the inspiration in implementing this 
feature.&lt;/p&gt;
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
+Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
+&lt;h2&gt;Complex Type Support&lt;/h2&gt;
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
+&lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
+&lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
+&lt;ul&gt;
+&lt;li&gt;Comet 0.7.0 is now tested against Spark 3.5.4 rather than 
3.5.1&lt;/li&gt;
+&lt;li&gt;This will be the last Comet release to support Spark 3.3.x&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Tuning Guide&lt;/h2&gt;
+&lt;p&gt;The &lt;a 
href="https://datafusion.apache.org/comet/user-guide/tuning.html"&gt;Comet 
Tuning Guide&lt;/a&gt; has been improved and now provides guidance on 
determining how much memory to allocate to 
+Comet.&lt;/p&gt;
+&lt;h2&gt;Getting Involved&lt;/h2&gt;
+&lt;p&gt;The Comet project welcomes new contributors. We use the same &lt;a 
href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord"&gt;Slack
 and Discord&lt;/a&gt; channels as the main DataFusion
+project and have a weekly &lt;a 
href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing"&gt;DataFusion
 video call&lt;/a&gt;.&lt;/p&gt;
+&lt;p&gt;The easiest way to get involved is to test Comet with your current 
Spark jobs and file issues for any bugs or
+performance regressions that you find. See the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/installation.html"&gt;Getting
 Started&lt;/a&gt; guide for instructions on downloading and installing
+Comet.&lt;/p&gt;
+&lt;p&gt;There are also many &lt;a 
href="https://github.com/apache/datafusion-comet/contribute"&gt;good first 
issues&lt;/a&gt; waiting for contributions.&lt;/p&gt;</content><category 
term="blog"></category></entry><entry><title>Parquet Pruning in DataFusion: 
Read Only What Matters</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/parquet-pruning" 
rel="alternate"></link><published>2025-03-20T00:00:00+00:00</published><updated>2025-03-20T00:00:00+00:00</updated><author><name
 [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/pmc.atom.xml b/output/feeds/pmc.atom.xml
index 985a30c..9bbf15c 100644
--- a/output/feeds/pmc.atom.xml
+++ b/output/feeds/pmc.atom.xml
@@ -1,5 +1,105 @@
 <?xml version="1.0" encoding="utf-8"?>
-<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
pmc</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/pmc.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-02-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
 DataFusion 45.0.0 Released</title><link 
href="https://datafusion.apache.org/blog/2025/02/20/datafusion-45.0.0" 
rel="alte [...]
+<feed xmlns="http://www.w3.org/2005/Atom"><title>Apache DataFusion Blog - 
pmc</title><link href="https://datafusion.apache.org/blog/" 
rel="alternate"></link><link 
href="https://datafusion.apache.org/blog/feeds/pmc.atom.xml" 
rel="self"></link><id>https://datafusion.apache.org/blog/</id><updated>2025-03-20T00:00:00+00:00</updated><subtitle></subtitle><entry><title>Apache
 DataFusion Comet 0.7.0 Release</title><link 
href="https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0"  
[...]
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to 
…&lt;/p&gt;</summary><content type="html">&lt;!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to provide 100% 
compatibility with Apache Spark. Any operators or
+expressions that are not fully compatible will fall back to Spark unless 
explicitly enabled by the user. Refer
+to the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/compatibility.html"&gt;compatibility
 guide&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;p&gt;This release covers approximately four weeks of development work and 
is the result of merging 46 PRs from 11
+contributors. See the &lt;a 
href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.7.0.md"&gt;change
 log&lt;/a&gt; for more information.&lt;/p&gt;
+&lt;h2&gt;Release Highlights&lt;/h2&gt;
+&lt;h3&gt;Performance&lt;/h3&gt;
+&lt;p&gt;Comet 0.7.0 has improved performance compared to the previous release 
due to improvements in the native shuffle 
+implementation and performance improvements in DataFusion 46.&lt;/p&gt;
+&lt;p&gt;For single-node TPC-H at 100 GB, Comet now delivers a 
&lt;strong&gt;greater than 2x speedup&lt;/strong&gt; compared to Spark using 
the same 
+CPU and RAM. Even with &lt;strong&gt;half the resources&lt;/strong&gt;, Comet 
still provides a measurable performance improvement.&lt;/p&gt;
+&lt;p&gt;&lt;img alt="Chart showing TPC-H benchmark results for Comet 0.7.0" 
class="img-responsive" src="/blog/images/comet-0.7.0/performance.png" 
width="100%"/&gt;&lt;/p&gt;
+&lt;p&gt;&lt;em&gt;These benchmarks were performed on a Linux workstation with 
PCIe 5, AMD 7950X CPU (16 cores), 128 GB RAM, and data 
+stored locally in Parquet format on NVMe storage. Spark was running in 
Kubernetes with hard memory limits.&lt;/em&gt;&lt;/p&gt;
+&lt;h2&gt;Shuffle Improvements&lt;/h2&gt;
+&lt;p&gt;There are several improvements to shuffle in this release:&lt;/p&gt;
+&lt;ul&gt;
+&lt;li&gt;When running in off-heap mode (which is the recommended approach), 
Comet was using the wrong memory allocator 
+  implementation for some types of shuffle operation, which could result in 
OOM rather than spilling to disk.&lt;/li&gt;
+&lt;li&gt;The number of spill files is drastically reduced. In previous 
releases, each instance of ShuffleMapTask could 
+  potentially create a new spill file for each output partition each time that 
spill was invoked. Comet now creates 
+  a maximum of one spill file per output partition per instance of 
ShuffleMapTask, which is appended to in subsequent 
+  spills.&lt;/li&gt;
+&lt;li&gt;There was a flaw in the memory accounting which resulted in Comet 
requesting approximately twice the amount of 
+  memory that was needed, resulting in premature spilling. This is now 
resolved.&lt;/li&gt;
+&lt;li&gt;The metric for number of spilled bytes is now accurate. It was 
previously reporting invalid information.&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Hash Join Performance&lt;/h2&gt;
+&lt;p&gt;When using the 
&lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to 
replace sort-merge joins with hash joins, Comet 
+will now do a better job of picking the optimal build side. Thanks to &lt;a 
href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, 
and thanks to the 
+&lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache 
Gluten(incubating)&lt;/a&gt; project for the inspiration in implementing this 
feature.&lt;/p&gt;
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
+Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
+&lt;h2&gt;Complex Type Support&lt;/h2&gt;
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
+&lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
+&lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
+&lt;ul&gt;
+&lt;li&gt;Comet 0.7.0 is now tested against Spark 3.5.4 rather than 
3.5.1&lt;/li&gt;
+&lt;li&gt;This will be the last Comet release to support Spark 3.3.x&lt;/li&gt;
+&lt;/ul&gt;
+&lt;h2&gt;Improved Tuning Guide&lt;/h2&gt;
+&lt;p&gt;The &lt;a 
href="https://datafusion.apache.org/comet/user-guide/tuning.html"&gt;Comet 
Tuning Guide&lt;/a&gt; has been improved and now provides guidance on 
determining how much memory to allocate to 
+Comet.&lt;/p&gt;
+&lt;h2&gt;Getting Involved&lt;/h2&gt;
+&lt;p&gt;The Comet project welcomes new contributors. We use the same &lt;a 
href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord"&gt;Slack
 and Discord&lt;/a&gt; channels as the main DataFusion
+project and have a weekly &lt;a 
href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing"&gt;DataFusion
 video call&lt;/a&gt;.&lt;/p&gt;
+&lt;p&gt;The easiest way to get involved is to test Comet with your current 
Spark jobs and file issues for any bugs or
+performance regressions that you find. See the &lt;a 
href="https://datafusion.apache.org/comet/user-guide/installation.html"&gt;Getting
 Started&lt;/a&gt; guide for instructions on downloading and installing
+Comet.&lt;/p&gt;
+&lt;p&gt;There are also many &lt;a 
href="https://github.com/apache/datafusion-comet/contribute"&gt;good first 
issues&lt;/a&gt; waiting for contributions.&lt;/p&gt;</content><category 
term="blog"></category></entry><entry><title>Apache DataFusion 45.0.0 
Released</title><link 
href="https://datafusion.apache.org/blog/2025/02/20/datafusion-45.0.0" 
rel="alternate"></link><published>2025-02-20T00:00:00+00:00</published><updated>2025-02-20T00:00:00+00:00</updated><author><name>pmc</name></autho
 [...]
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/feeds/pmc.rss.xml b/output/feeds/pmc.rss.xml
index 7620f23..13aca78 100644
--- a/output/feeds/pmc.rss.xml
+++ b/output/feeds/pmc.rss.xml
@@ -1,5 +1,26 @@
 <?xml version="1.0" encoding="utf-8"?>
-<rss version="2.0"><channel><title>Apache DataFusion Blog - 
pmc</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
 20 Feb 2025 00:00:00 +0000</lastBuildDate><item><title>Apache DataFusion 
45.0.0 
Released</title><link>https://datafusion.apache.org/blog/2025/02/20/datafusion-45.0.0</link><description>&lt;!--
+<rss version="2.0"><channel><title>Apache DataFusion Blog - 
pmc</title><link>https://datafusion.apache.org/blog/</link><description></description><lastBuildDate>Thu,
 20 Mar 2025 00:00:00 +0000</lastBuildDate><item><title>Apache DataFusion Comet 
0.7.0 
Release</title><link>https://datafusion.apache.org/blog/2025/03/20/datafusion-comet-0.7.0</link><description>&lt;!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+--&gt;
+&lt;p&gt;The Apache DataFusion PMC is pleased to announce version 0.7.0 of the 
&lt;a href="https://datafusion.apache.org/comet/"&gt;Comet&lt;/a&gt; 
subproject.&lt;/p&gt;
+&lt;p&gt;Comet is an accelerator for Apache Spark that translates Spark 
physical plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code 
changes.&lt;/p&gt;
+&lt;p&gt;Comet runs on commodity hardware and aims to 
…&lt;/p&gt;</description><dc:creator 
xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Thu, 20 
Mar 2025 00:00:00 +0000</pubDate><guid 
isPermaLink="false">tag:datafusion.apache.org,2025-03-20:/blog/2025/03/20/datafusion-comet-0.7.0</guid><category>blog</category></item><item><title>Apache
 DataFusion 45.0.0 
Released</title><link>https://datafusion.apache.org/blog/2025/02/20/datafusion-45.0.0</link><description>&lt;!--
 {% comment %}
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
diff --git a/output/images/comet-0.7.0/performance.png 
b/output/images/comet-0.7.0/performance.png
new file mode 100644
index 0000000..0338575
Binary files /dev/null and b/output/images/comet-0.7.0/performance.png differ
diff --git a/output/index.html b/output/index.html
index 744f383..9fecc28 100644
--- a/output/index.html
+++ b/output/index.html
@@ -44,6 +44,46 @@
             <p><i>Here you can find the latest updates from DataFusion and 
related projects.</i></p>
 
 
+    <!-- Post -->
+    <div class="row">
+        <div class="callout">
+            <article class="post">
+                <header>
+                    <div class="title">
+                        <h1><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0">Apache DataFusion Comet 0.7.0 
Release</a></h1>
+                        <p>Posted on: Thu 20 March 2025 by pmc</p>
+                        <p><!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+<p>The Apache DataFusion PMC is pleased to announce version 0.7.0 of the <a 
href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p>
+<p>Comet is an accelerator for Apache Spark that translates Spark physical 
plans to DataFusion physical plans for
+improved performance and efficiency without requiring any code changes.</p>
+<p>Comet runs on commodity hardware and aims to …</p></p>
+                        <footer>
+                            <ul class="actions">
+                                <div style="text-align: right"><a 
href="/blog/2025/03/20/datafusion-comet-0.7.0" class="button medium">Continue 
Reading</a></div>
+                            </ul>
+                            <ul class="stats">
+                            </ul>
+                        </footer>
+            </article>
+        </div>
+    </div>
     <!-- Post -->
     <div class="row">
         <div class="callout">

