This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-staging in repository https://gitbox.apache.org/repos/asf/datafusion-site.git
The following commit(s) were added to refs/heads/asf-staging by this push: new c4d57fc Commit build products c4d57fc is described below commit c4d57fcb0b4680bbe3bf88c124b79dfec8dda296 Author: Build Pelican (action) <priv...@infra.apache.org> AuthorDate: Mon Sep 15 13:52:26 2025 +0000 Commit build products --- blog/2025/09/13/datafusion-comet-0.10.0/index.html | 58 +++++++++++++++++----- blog/author/pmc.html | 6 +-- blog/category/blog.html | 6 +-- blog/feed.xml | 6 +-- blog/feeds/all-en.atom.xml | 30 +++++------ blog/feeds/blog.atom.xml | 30 +++++------ blog/feeds/pmc.atom.xml | 30 +++++------ blog/feeds/pmc.rss.xml | 6 +-- blog/index.html | 6 +-- 9 files changed, 106 insertions(+), 72 deletions(-) diff --git a/blog/2025/09/13/datafusion-comet-0.10.0/index.html b/blog/2025/09/13/datafusion-comet-0.10.0/index.html index fcdc876..0e0e377 100644 --- a/blog/2025/09/13/datafusion-comet-0.10.0/index.html +++ b/blog/2025/09/13/datafusion-comet-0.10.0/index.html @@ -44,6 +44,23 @@ </h1> <p>Posted on: Sat 13 September 2025 by pmc</p> + <aside class="d-md-none mb-2"> + <div class="toc"><span class="toctitle">Contents</span><ul> +<li><a href="#release-highlights">Release Highlights</a><ul> +<li><a href="#improved-support-for-apache-iceberg">Improved Support for Apache Iceberg</a></li> +<li><a href="#improved-spark-400-support">Improved Spark 4.0.0 Support</a></li> +<li><a href="#new-functionality">New Functionality</a></li> +<li><a href="#ux-improvements">UX Improvements</a></li> +<li><a href="#bug-fixes">Bug Fixes</a></li> +<li><a href="#benchmarking">Benchmarking</a></li> +<li><a href="#documentation-updates">Documentation Updates</a></li> +<li><a href="#spark-compatibility">Spark Compatibility</a></li> +</ul> +</li> +<li><a href="#getting-involved">Getting Involved</a></li> +</ul> +</div> + </aside> <!-- {% comment %} @@ -63,22 +80,22 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> <p>This release covers approximately ten weeks of development work and is the result of merging 183 PRs from 26 contributors. See the <a href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.10.0.md">change log</a> for more information.</p> -<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">¶</a></h2> -<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">¶</a></h3> +<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">¶</a></h2> +<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">¶</a></h3> <p>It is now possible to use Comet with Apache Iceberg 1.8.1 to accelerate reads of Iceberg Parquet tables.</p> -<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">¶</a></h3> +<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">¶</a></h3> <p>Comet no longer falls back to Spark for all queries when ANSI mode is enabled (which is the default in Spark 4.0.0). 
Instead, Comet will now only fall back to Spark for arithmetic and aggregates expressions that support ANSI mode.</p> <p>Setting <code>spark.comet.ansi.ignore=true</code> will override this behavior and force these expressions to continue to be accelerated by Comet. Full support for ANSI mode will be available in a future release.</p> <p>Comet will now use the <code>native_iceberg_compat</code> scan for Spark 4.0.0 in most cases, which supports reading complex types.</p> -<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">¶</a></h3> +<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">¶</a></h3> <p>The following SQL functions are now supported:</p> <ul> <li><code>array_min</code></li> @@ -99,23 +116,23 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>Support for array literals</li> <li>Support for limit with offset</li> </ul> -<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">¶</a></h3> +<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">¶</a></h3> <ul> <li>Improved reporting of reasons why Comet cannot accelerate some operators and expressions</li> <li>New <code>spark.comet.logFallbackReasons.enabled</code> configuration setting for logging all fallback reasons</li> <li>CometScan nodes in the physical plan now show which scan implementation is being used (<code>native_comet</code>, <code>native_datafusion</code>, or <code>native_iceberg_compat</code>)</li> </ul> -<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">¶</a></h3> +<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">¶</a></h3> <ul> <li>Improved memory safety for FFI transfers</li> <li>Fixed a double-free issue in the shuffle unified memory pool</li> 
<li>Non zero offset FFI issue</li> <li>Fixed HDFS buffer read issue </li> </ul> -<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">¶</a></h3> +<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">¶</a></h3> <p>Benchmarking scripts for benchmarks based on TPC-H and TPS-DS are now available in the repository under <code>dev/benchmarks</code>.</p> -<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">¶</a></h3> +<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">¶</a></h3> <ul> <li>The documentation for supported <a href="https://datafusion.apache.org/comet/user-guide/latest/operators.html">operators</a> and <a href="https://datafusion.apache.org/comet/user-guide/latest/expressions.html">expressions</a> is now more complete, and Spark-compatibility status per operator/expression is now documented.</li> @@ -123,14 +140,14 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>New guide comparing Comet with Apache Gluten (incubating) + Velox</li> <li>User guides are now available for multiple Comet versions</li> </ul> -<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">¶</a></h3> +<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">¶</a></h3> <ul> <li>Spark 3.4.3 with JDK 11 & 17, Scala 2.12 & 2.13</li> <li>Spark 3.5.4 through 3.5.6 with JDK 11 & 17, Scala 2.12 & 2.13</li> <li>Experimental support for Spark 4.0.0 with JDK 17, Scala 2.13</li> </ul> <p>We are looking for help from the community to fully support Spark 4.0.0. 
See <a href="https://github.com/apache/datafusion-comet/issues/1637">EPIC: Support 4.0.0</a> for more information.</p> -<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">¶</a></h2> +<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">¶</a></h2> <p>The Comet project welcomes new contributors. We use the same <a href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord">Slack and Discord</a> channels as the main DataFusion project and have a weekly <a href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing">DataFusion video call</a>.</p> <p>The easiest way to get involved is to test Comet with your current Spark jobs and file issues for any bugs or @@ -168,6 +185,23 @@ Comet.</p> <!-- Container where Giscus will render --> <div id="comment-thread"></div> </div> </div> + <aside class="d-none d-md-block col-md-4 col-xl-3 ms-xl-2"> + <div class="toc"><span class="toctitle">Contents</span><ul> +<li><a href="#release-highlights">Release Highlights</a><ul> +<li><a href="#improved-support-for-apache-iceberg">Improved Support for Apache Iceberg</a></li> +<li><a href="#improved-spark-400-support">Improved Spark 4.0.0 Support</a></li> +<li><a href="#new-functionality">New Functionality</a></li> +<li><a href="#ux-improvements">UX Improvements</a></li> +<li><a href="#bug-fixes">Bug Fixes</a></li> +<li><a href="#benchmarking">Benchmarking</a></li> +<li><a href="#documentation-updates">Documentation Updates</a></li> +<li><a href="#spark-compatibility">Spark Compatibility</a></li> +</ul> +</li> +<li><a href="#getting-involved">Getting Involved</a></li> +</ul> +</div> + </aside> </div> </div> </div> diff --git a/blog/author/pmc.html b/blog/author/pmc.html index a3c6358..0466942 100644 --- a/blog/author/pmc.html +++ b/blog/author/pmc.html @@ -46,11 +46,11 @@ See the License for the 
specific language governing permissions and limitations under the License. {% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p> </div><!-- /.entry-content --> +<p>This release covers approximately ten weeks of development …</p> </div><!-- /.entry-content --> </article></li> <li><article class="hentry"> <header> <h2 class="entry-title"><a href="https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0" rel="bookmark" title="Permalink to Apache DataFusion 49.0.0 Released">Apache DataFusion 49.0.0 Released</a></h2> </header> diff --git a/blog/category/blog.html b/blog/category/blog.html index 36bc423..5c09028 100644 --- a/blog/category/blog.html +++ b/blog/category/blog.html @@ -47,11 +47,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p> </div><!-- /.entry-content --> +<p>This release covers approximately ten weeks of development …</p> </div><!-- /.entry-content --> </article></li> <li><article class="hentry"> <header> <h2 class="entry-title"><a href="https://datafusion.apache.org/blog/2025/09/10/dynamic-filters" rel="bookmark" title="Permalink to Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries">Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries</a></h2> </header> diff --git a/blog/feed.xml b/blog/feed.xml index f52f622..bf6d941 100644 --- a/blog/feed.xml +++ b/blog/feed.xml @@ -17,11 +17,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Sat, 13 Sep 2025 00:00:00 +0000</pubDate><guid isPermaLink="false">tag:datafusion.apache.org,2025-09-13:/blog/2025/09/13/datafusion-comet-0.10.0</guid><category>blog</category></item><item><title>Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries</title><link>https://datafusion.apache. [...] +<p>This release covers approximately ten weeks of development …</p></description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Sat, 13 Sep 2025 00:00:00 +0000</pubDate><guid isPermaLink="false">tag:datafusion.apache.org,2025-09-13:/blog/2025/09/13/datafusion-comet-0.10.0</guid><category>blog</category></item><item><title>Dynamic Filters: Passing Information Between Operators During Execution for 25x Faster Queries</title><link>https://datafu [...] {% comment %} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml index df3f658..e47d8eb 100644 --- a/blog/feeds/all-en.atom.xml +++ b/blog/feeds/all-en.atom.xml @@ -17,11 +17,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></summary><content type="html"><!-- +<p>This release covers approximately ten weeks of development …</p></summary><content type="html"><!-- {% comment %} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -39,22 +39,22 @@ See the License for the specific language governing permissions and limitations under the License. {% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> <p>This release covers approximately ten weeks of development work and is the result of merging 183 PRs from 26 contributors. 
See the <a href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.10.0.md">change log</a> for more information.</p> -<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">&para;</a></h2> -<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">&para;</a></h3> +<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">¶</a></h2> +<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">¶</a></h3> <p>It is now possible to use Comet with Apache Iceberg 1.8.1 to accelerate reads of Iceberg Parquet tables.</p> -<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">&para;</a></h3> +<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">¶</a></h3> <p>Comet no longer falls back to Spark for all queries when ANSI mode is enabled (which is the default in Spark 4.0.0). Instead, Comet will now only fall back to Spark for arithmetic and aggregates expressions that support ANSI mode.</p> <p>Setting <code>spark.comet.ansi.ignore=true</code> will override this behavior and force these expressions to continue to be accelerated by Comet. 
Full support for ANSI mode will be available in a future release.</p> <p>Comet will now use the <code>native_iceberg_compat</code> scan for Spark 4.0.0 in most cases, which supports reading complex types.</p> -<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">&para;</a></h3> +<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">¶</a></h3> <p>The following SQL functions are now supported:</p> <ul> <li><code>array_min</code></li> @@ -75,23 +75,23 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>Support for array literals</li> <li>Support for limit with offset</li> </ul> -<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">&para;</a></h3> +<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">¶</a></h3> <ul> <li>Improved reporting of reasons why Comet cannot accelerate some operators and expressions</li> <li>New <code>spark.comet.logFallbackReasons.enabled</code> configuration setting for logging all fallback reasons</li> <li>CometScan nodes in the physical plan now show which scan implementation is being used (<code>native_comet</code>, <code>native_datafusion</code>, or <code>native_iceberg_compat</code>)</li> </ul> -<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">&para;</a></h3> +<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">¶</a></h3> <ul> <li>Improved memory safety for FFI transfers</li> <li>Fixed a double-free issue in the shuffle unified memory pool</li> <li>Non zero offset FFI issue</li> <li>Fixed HDFS buffer read issue </li> </ul> -<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">&para;</a></h3> +<h3 id="benchmarking">Benchmarking<a 
class="headerlink" href="#benchmarking" title="Permanent link">¶</a></h3> <p>Benchmarking scripts for benchmarks based on TPC-H and TPS-DS are now available in the repository under <code>dev/benchmarks</code>.</p> -<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">&para;</a></h3> +<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">¶</a></h3> <ul> <li>The documentation for supported <a href="https://datafusion.apache.org/comet/user-guide/latest/operators.html">operators</a> and <a href="https://datafusion.apache.org/comet/user-guide/latest/expressions.html">expressions</a> is now more complete, and Spark-compatibility status per operator/expression is now documented.</li> @@ -99,14 +99,14 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>New guide comparing Comet with Apache Gluten (incubating) + Velox</li> <li>User guides are now available for multiple Comet versions</li> </ul> -<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">&para;</a></h3> +<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">¶</a></h3> <ul> <li>Spark 3.4.3 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Spark 3.5.4 through 3.5.6 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Experimental support for Spark 4.0.0 with JDK 17, Scala 2.13</li> </ul> <p>We are looking for help from the community to fully support Spark 4.0.0. 
See <a href="https://github.com/apache/datafusion-comet/issues/1637">EPIC: Support 4.0.0</a> for more information.</p> -<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">&para;</a></h2> +<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">¶</a></h2> <p>The Comet project welcomes new contributors. We use the same <a href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord">Slack and Discord</a> channels as the main DataFusion project and have a weekly <a href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing">DataFusion video call</a>.</p> <p>The easiest way to get involved is to test Comet with your current Spark jobs and file issues for any bugs or diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml index a483cfc..d7b83d1 100644 --- a/blog/feeds/blog.atom.xml +++ b/blog/feeds/blog.atom.xml @@ -17,11 +17,11 @@ See the License for the specific language governing permissions and limitations under the License. {% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></summary><content type="html"><!-- +<p>This release covers approximately ten weeks of development …</p></summary><content type="html"><!-- {% comment %} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. 
See the NOTICE file distributed with @@ -39,22 +39,22 @@ See the License for the specific language governing permissions and limitations under the License. {% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> <p>This release covers approximately ten weeks of development work and is the result of merging 183 PRs from 26 contributors. See the <a href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.10.0.md">change log</a> for more information.</p> -<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">&para;</a></h2> -<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">&para;</a></h3> +<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">¶</a></h2> +<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">¶</a></h3> <p>It is now possible to use Comet with Apache Iceberg 1.8.1 to accelerate reads of Iceberg Parquet tables.</p> -<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">&para;</a></h3> +<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">¶</a></h3> 
<p>Comet no longer falls back to Spark for all queries when ANSI mode is enabled (which is the default in Spark 4.0.0). Instead, Comet will now only fall back to Spark for arithmetic and aggregates expressions that support ANSI mode.</p> <p>Setting <code>spark.comet.ansi.ignore=true</code> will override this behavior and force these expressions to continue to be accelerated by Comet. Full support for ANSI mode will be available in a future release.</p> <p>Comet will now use the <code>native_iceberg_compat</code> scan for Spark 4.0.0 in most cases, which supports reading complex types.</p> -<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">&para;</a></h3> +<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">¶</a></h3> <p>The following SQL functions are now supported:</p> <ul> <li><code>array_min</code></li> @@ -75,23 +75,23 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>Support for array literals</li> <li>Support for limit with offset</li> </ul> -<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">&para;</a></h3> +<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">¶</a></h3> <ul> <li>Improved reporting of reasons why Comet cannot accelerate some operators and expressions</li> <li>New <code>spark.comet.logFallbackReasons.enabled</code> configuration setting for logging all fallback reasons</li> <li>CometScan nodes in the physical plan now show which scan implementation is being used (<code>native_comet</code>, <code>native_datafusion</code>, or <code>native_iceberg_compat</code>)</li> </ul> -<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">&para;</a></h3> +<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent 
link">¶</a></h3> <ul> <li>Improved memory safety for FFI transfers</li> <li>Fixed a double-free issue in the shuffle unified memory pool</li> <li>Non zero offset FFI issue</li> <li>Fixed HDFS buffer read issue </li> </ul> -<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">&para;</a></h3> +<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">¶</a></h3> <p>Benchmarking scripts for benchmarks based on TPC-H and TPS-DS are now available in the repository under <code>dev/benchmarks</code>.</p> -<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">&para;</a></h3> +<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">¶</a></h3> <ul> <li>The documentation for supported <a href="https://datafusion.apache.org/comet/user-guide/latest/operators.html">operators</a> and <a href="https://datafusion.apache.org/comet/user-guide/latest/expressions.html">expressions</a> is now more complete, and Spark-compatibility status per operator/expression is now documented.</li> @@ -99,14 +99,14 @@ accelerated by Comet. 
Full support for ANSI mode will be available in a future r <li>New guide comparing Comet with Apache Gluten (incubating) + Velox</li> <li>User guides are now available for multiple Comet versions</li> </ul> -<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">&para;</a></h3> +<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">¶</a></h3> <ul> <li>Spark 3.4.3 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Spark 3.5.4 through 3.5.6 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Experimental support for Spark 4.0.0 with JDK 17, Scala 2.13</li> </ul> <p>We are looking for help from the community to fully support Spark 4.0.0. See <a href="https://github.com/apache/datafusion-comet/issues/1637">EPIC: Support 4.0.0</a> for more information.</p> -<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">&para;</a></h2> +<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">¶</a></h2> <p>The Comet project welcomes new contributors. We use the same <a href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord">Slack and Discord</a> channels as the main DataFusion project and have a weekly <a href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing">DataFusion video call</a>.</p> <p>The easiest way to get involved is to test Comet with your current Spark jobs and file issues for any bugs or diff --git a/blog/feeds/pmc.atom.xml b/blog/feeds/pmc.atom.xml index 4994d24..0caaf79 100644 --- a/blog/feeds/pmc.atom.xml +++ b/blog/feeds/pmc.atom.xml @@ -17,11 +17,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></summary><content type="html"><!-- +<p>This release covers approximately ten weeks of development …</p></summary><content type="html"><!-- {% comment %} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -39,22 +39,22 @@ See the License for the specific language governing permissions and limitations under the License. {% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> <p>This release covers approximately ten weeks of development work and is the result of merging 183 PRs from 26 contributors. 
See the <a href="https://github.com/apache/datafusion-comet/blob/main/dev/changelog/0.10.0.md">change log</a> for more information.</p> -<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">&para;</a></h2> -<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">&para;</a></h3> +<h2 id="release-highlights">Release Highlights<a class="headerlink" href="#release-highlights" title="Permanent link">¶</a></h2> +<h3 id="improved-support-for-apache-iceberg">Improved Support for Apache Iceberg<a class="headerlink" href="#improved-support-for-apache-iceberg" title="Permanent link">¶</a></h3> <p>It is now possible to use Comet with Apache Iceberg 1.8.1 to accelerate reads of Iceberg Parquet tables.</p> -<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">&para;</a></h3> +<h3 id="improved-spark-400-support">Improved Spark 4.0.0 Support<a class="headerlink" href="#improved-spark-400-support" title="Permanent link">¶</a></h3> <p>Comet no longer falls back to Spark for all queries when ANSI mode is enabled (which is the default in Spark 4.0.0). Instead, Comet will now only fall back to Spark for arithmetic and aggregates expressions that support ANSI mode.</p> <p>Setting <code>spark.comet.ansi.ignore=true</code> will override this behavior and force these expressions to continue to be accelerated by Comet. 
Full support for ANSI mode will be available in a future release.</p> <p>Comet will now use the <code>native_iceberg_compat</code> scan for Spark 4.0.0 in most cases, which supports reading complex types.</p> -<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">&para;</a></h3> +<h3 id="new-functionality">New Functionality<a class="headerlink" href="#new-functionality" title="Permanent link">¶</a></h3> <p>The following SQL functions are now supported:</p> <ul> <li><code>array_min</code></li> @@ -75,23 +75,23 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>Support for array literals</li> <li>Support for limit with offset</li> </ul> -<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">&para;</a></h3> +<h3 id="ux-improvements">UX Improvements<a class="headerlink" href="#ux-improvements" title="Permanent link">¶</a></h3> <ul> <li>Improved reporting of reasons why Comet cannot accelerate some operators and expressions</li> <li>New <code>spark.comet.logFallbackReasons.enabled</code> configuration setting for logging all fallback reasons</li> <li>CometScan nodes in the physical plan now show which scan implementation is being used (<code>native_comet</code>, <code>native_datafusion</code>, or <code>native_iceberg_compat</code>)</li> </ul> -<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">&para;</a></h3> +<h3 id="bug-fixes">Bug Fixes<a class="headerlink" href="#bug-fixes" title="Permanent link">¶</a></h3> <ul> <li>Improved memory safety for FFI transfers</li> <li>Fixed a double-free issue in the shuffle unified memory pool</li> <li>Non zero offset FFI issue</li> <li>Fixed HDFS buffer read issue </li> </ul> -<h3 id="benchmarking">Benchmarking<a class="headerlink" href="#benchmarking" title="Permanent link">&para;</a></h3> +<h3 id="benchmarking">Benchmarking<a 
class="headerlink" href="#benchmarking" title="Permanent link">¶</a></h3> <p>Benchmarking scripts for benchmarks based on TPC-H and TPS-DS are now available in the repository under <code>dev/benchmarks</code>.</p> -<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">&para;</a></h3> +<h3 id="documentation-updates">Documentation Updates<a class="headerlink" href="#documentation-updates" title="Permanent link">¶</a></h3> <ul> <li>The documentation for supported <a href="https://datafusion.apache.org/comet/user-guide/latest/operators.html">operators</a> and <a href="https://datafusion.apache.org/comet/user-guide/latest/expressions.html">expressions</a> is now more complete, and Spark-compatibility status per operator/expression is now documented.</li> @@ -99,14 +99,14 @@ accelerated by Comet. Full support for ANSI mode will be available in a future r <li>New guide comparing Comet with Apache Gluten (incubating) + Velox</li> <li>User guides are now available for multiple Comet versions</li> </ul> -<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">&para;</a></h3> +<h3 id="spark-compatibility">Spark Compatibility<a class="headerlink" href="#spark-compatibility" title="Permanent link">¶</a></h3> <ul> <li>Spark 3.4.3 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Spark 3.5.4 through 3.5.6 with JDK 11 &amp; 17, Scala 2.12 &amp; 2.13</li> <li>Experimental support for Spark 4.0.0 with JDK 17, Scala 2.13</li> </ul> <p>We are looking for help from the community to fully support Spark 4.0.0. 
See <a href="https://github.com/apache/datafusion-comet/issues/1637">EPIC: Support 4.0.0</a> for more information.</p> -<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">&para;</a></h2> +<h2 id="getting-involved">Getting Involved<a class="headerlink" href="#getting-involved" title="Permanent link">¶</a></h2> <p>The Comet project welcomes new contributors. We use the same <a href="https://datafusion.apache.org/contributor-guide/communication.html#slack-and-discord">Slack and Discord</a> channels as the main DataFusion project and have a weekly <a href="https://docs.google.com/document/d/1NBpkIAuU7O9h8Br5CbFksDhX-L9TyO9wmGLPMe0Plc8/edit?usp=sharing">DataFusion video call</a>.</p> <p>The easiest way to get involved is to test Comet with your current Spark jobs and file issues for any bugs or diff --git a/blog/feeds/pmc.rss.xml b/blog/feeds/pmc.rss.xml index fd04169..64ee2a5 100644 --- a/blog/feeds/pmc.rss.xml +++ b/blog/feeds/pmc.rss.xml @@ -17,11 +17,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Sat, 13 Sep 2025 00:00:00 +0000</pubDate><guid isPermaLink="false">tag:datafusion.apache.org,2025-09-13:/blog/2025/09/13/datafusion-comet-0.10.0</guid><category>blog</category></item><item><title>Apache DataFusion 49.0.0 Released</title><link>https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0</link><description><!-- +<p>This release covers approximately ten weeks of development …</p></description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">pmc</dc:creator><pubDate>Sat, 13 Sep 2025 00:00:00 +0000</pubDate><guid isPermaLink="false">tag:datafusion.apache.org,2025-09-13:/blog/2025/09/13/datafusion-comet-0.10.0</guid><category>blog</category></item><item><title>Apache DataFusion 49.0.0 Released</title><link>https://datafusion.apache.org/blog/2025/07/28/datafusion-49.0.0</link><desc [...] {% comment %} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with diff --git a/blog/index.html b/blog/index.html index 4694caf..469bae0 100644 --- a/blog/index.html +++ b/blog/index.html @@ -71,11 +71,11 @@ See the License for the specific language governing permissions and limitations under the License. 
{% endcomment %} --> -<p>[TOC] -The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> + +<p>The Apache DataFusion PMC is pleased to announce version 0.10.0 of the <a href="https://datafusion.apache.org/comet/">Comet</a> subproject.</p> <p>Comet is an accelerator for Apache Spark that translates Spark physical plans to DataFusion physical plans for improved performance and efficiency without requiring any code changes.</p> -<p>This release covers approximately ten weeks of …</p></p> +<p>This release covers approximately ten weeks of development …</p></p> <footer> <ul class="actions"> <div style="text-align: right"><a href="/blog/2025/09/13/datafusion-comet-0.10.0" class="button medium">Continue Reading</a></div>