(datafusion-site) branch asf-staging updated: Commit build products

github-bot Thu, 20 Mar 2025 16:26:21 -0700

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git



The following commit(s) were added to refs/heads/asf-staging by this push:
     new f1d6da6  Commit build products
f1d6da6 is described below

commit f1d6da64d930c0b65091595aeace9bd2a088dc51
Author: Build Pelican (action) <priv...@infra.apache.org>
AuthorDate: Thu Mar 20 21:46:42 2025 +0000

    Commit build products
---
 blog/2025/03/20/datafusion-comet-0.7.0/index.html | 16 +++++++++-------
 blog/feeds/all-en.atom.xml                        | 16 +++++++++-------
 blog/feeds/blog.atom.xml                          | 16 +++++++++-------
 blog/feeds/pmc.atom.xml                           | 16 +++++++++-------
 4 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/blog/2025/03/20/datafusion-comet-0.7.0/index.html 
b/blog/2025/03/20/datafusion-comet-0.7.0/index.html
index 763cd9e..92044f5 100644
--- a/blog/2025/03/20/datafusion-comet-0.7.0/index.html
+++ b/blog/2025/03/20/datafusion-comet-0.7.0/index.html
@@ -95,16 +95,18 @@ stored locally in Parquet format on NVMe storage. Spark was 
running in Kubernete
 <p>When using the <code>spark.comet.exec.replaceSortMergeJoin</code> setting 
to replace sort-merge joins with hash joins, Comet 
 will now do a better job of picking the optimal build side. Thanks to <a 
href="https://github.com/hayman42">@hayman42</a> for suggesting this, and 
thanks to the 
 <a href="https://github.com/apache/incubator-gluten/">Apache 
Gluten(incubating)</a> project for the inspiration in implementing this 
feature.</p>
-<h2>Experimental Support for DataFusion&rsquo;s DataSourceExec</h2>
-<p>It is now possible to configure Comet to use DataFusion&rsquo;s 
<code>DataSourceExec</code> instead of Comet&rsquo;s current Parquet reader. 
-Support should still be considered experimental, but most of Comet&rsquo;s 
unit tests are now passing with the new reader. 
+<h2>Experimental Support for DataFusion&rsquo;s Parquet Scan</h2>
+<p>It is now possible to configure Comet to use DataFusion&rsquo;s Parquet 
reader instead of Comet&rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.</p>
+<p>Support should still be considered experimental, but most of Comet&rsquo;s 
unit tests are now passing with the new reader. 
 Known issues include handling of <code>INT96</code> timestamps and unsigned 
bytes and shorts.</p>
-<p>To enable DataFusion&rsquo;s <code>DataSourceExec</code>, either set 
<code>spark.comet.scan.impl=native_datafusion</code> or set the environment 
+<p>To enable DataFusion&rsquo;s Parquet reader, either set 
<code>spark.comet.scan.impl=native_datafusion</code> or set the environment 
 variable <code>COMET_PARQUET_SCAN_IMPL=native_datafusion</code>.</p>
 <h2>Complex Type Support</h2>
-<p>With DataFusion&rsquo;s <code>DataSourceExec</code> enabled, there is now 
some early support for reading structs from Parquet. This is 
-largely untested and we would welcome additional testing from the community to 
help determine what is and isn&rsquo;t working, 
-as well as contributions to improve support for structs and other complex 
types. The tracking issue is 
+<p>With DataFusion&rsquo;s Parquet reader enabled, there is now some early 
support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
 <a 
href="https://github.com/apache/datafusion-comet/issues/1043">https://github.com/apache/datafusion-comet/issues/1043</a>.</p>
 <h2>Updates to supported Spark versions</h2>
 <ul>
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 037f4fa..2dc033d 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -72,16 +72,18 @@ stored locally in Parquet format on NVMe storage. Spark was 
running in Kubernete
 &lt;p&gt;When using the 
&lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to 
replace sort-merge joins with hash joins, Comet 
 will now do a better job of picking the optimal build side. Thanks to &lt;a 
href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, 
and thanks to the 
 &lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache 
Gluten(incubating)&lt;/a&gt; project for the inspiration in implementing this 
feature.&lt;/p&gt;
-&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s 
DataSourceExec&lt;/h2&gt;
-&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt; instead of Comet&amp;rsquo;s current 
Parquet reader. 
-Support should still be considered experimental, but most of Comet&amp;rsquo;s 
unit tests are now passing with the new reader. 
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
 Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
-&lt;p&gt;To enable DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt;, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
 variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
 &lt;h2&gt;Complex Type Support&lt;/h2&gt;
-&lt;p&gt;With DataFusion&amp;rsquo;s &lt;code&gt;DataSourceExec&lt;/code&gt; 
enabled, there is now some early support for reading structs from Parquet. This 
is 
-largely untested and we would welcome additional testing from the community to 
help determine what is and isn&amp;rsquo;t working, 
-as well as contributions to improve support for structs and other complex 
types. The tracking issue is 
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
 &lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
 &lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
 &lt;ul&gt;
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index c9d12e0..5de58a8 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -72,16 +72,18 @@ stored locally in Parquet format on NVMe storage. Spark was 
running in Kubernete
 &lt;p&gt;When using the 
&lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to 
replace sort-merge joins with hash joins, Comet 
 will now do a better job of picking the optimal build side. Thanks to &lt;a 
href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, 
and thanks to the 
 &lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache 
Gluten(incubating)&lt;/a&gt; project for the inspiration in implementing this 
feature.&lt;/p&gt;
-&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s 
DataSourceExec&lt;/h2&gt;
-&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt; instead of Comet&amp;rsquo;s current 
Parquet reader. 
-Support should still be considered experimental, but most of Comet&amp;rsquo;s 
unit tests are now passing with the new reader. 
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
 Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
-&lt;p&gt;To enable DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt;, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
 variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
 &lt;h2&gt;Complex Type Support&lt;/h2&gt;
-&lt;p&gt;With DataFusion&amp;rsquo;s &lt;code&gt;DataSourceExec&lt;/code&gt; 
enabled, there is now some early support for reading structs from Parquet. This 
is 
-largely untested and we would welcome additional testing from the community to 
help determine what is and isn&amp;rsquo;t working, 
-as well as contributions to improve support for structs and other complex 
types. The tracking issue is 
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
 &lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
 &lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
 &lt;ul&gt;
diff --git a/blog/feeds/pmc.atom.xml b/blog/feeds/pmc.atom.xml
index 60b6dbe..9bbf15c 100644
--- a/blog/feeds/pmc.atom.xml
+++ b/blog/feeds/pmc.atom.xml
@@ -72,16 +72,18 @@ stored locally in Parquet format on NVMe storage. Spark was 
running in Kubernete
 &lt;p&gt;When using the 
&lt;code&gt;spark.comet.exec.replaceSortMergeJoin&lt;/code&gt; setting to 
replace sort-merge joins with hash joins, Comet 
 will now do a better job of picking the optimal build side. Thanks to &lt;a 
href="https://github.com/hayman42"&gt;@hayman42&lt;/a&gt; for suggesting this, 
and thanks to the 
 &lt;a href="https://github.com/apache/incubator-gluten/"&gt;Apache 
Gluten(incubating)&lt;/a&gt; project for the inspiration in implementing this 
feature.&lt;/p&gt;
-&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s 
DataSourceExec&lt;/h2&gt;
-&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt; instead of Comet&amp;rsquo;s current 
Parquet reader. 
-Support should still be considered experimental, but most of Comet&amp;rsquo;s 
unit tests are now passing with the new reader. 
+&lt;h2&gt;Experimental Support for DataFusion&amp;rsquo;s Parquet 
Scan&lt;/h2&gt;
+&lt;p&gt;It is now possible to configure Comet to use DataFusion&amp;rsquo;s 
Parquet reader instead of Comet&amp;rsquo;s current Parquet reader. This 
+has the advantage of supporting complex types, and also has performance 
optimizations that are not present in Comet's 
+existing reader.&lt;/p&gt;
+&lt;p&gt;Support should still be considered experimental, but most of 
Comet&amp;rsquo;s unit tests are now passing with the new reader. 
 Known issues include handling of &lt;code&gt;INT96&lt;/code&gt; timestamps and 
unsigned bytes and shorts.&lt;/p&gt;
-&lt;p&gt;To enable DataFusion&amp;rsquo;s 
&lt;code&gt;DataSourceExec&lt;/code&gt;, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
+&lt;p&gt;To enable DataFusion&amp;rsquo;s Parquet reader, either set 
&lt;code&gt;spark.comet.scan.impl=native_datafusion&lt;/code&gt; or set the 
environment 
 variable 
&lt;code&gt;COMET_PARQUET_SCAN_IMPL=native_datafusion&lt;/code&gt;.&lt;/p&gt;
 &lt;h2&gt;Complex Type Support&lt;/h2&gt;
-&lt;p&gt;With DataFusion&amp;rsquo;s &lt;code&gt;DataSourceExec&lt;/code&gt; 
enabled, there is now some early support for reading structs from Parquet. This 
is 
-largely untested and we would welcome additional testing from the community to 
help determine what is and isn&amp;rsquo;t working, 
-as well as contributions to improve support for structs and other complex 
types. The tracking issue is 
+&lt;p&gt;With DataFusion&amp;rsquo;s Parquet reader enabled, there is now some 
early support for reading structs from Parquet. This is 
+not thoroughly tested yet. We would welcome additional testing from the 
community to help determine what is and isn&amp;rsquo;t 
+working, as well as contributions to improve support for structs and other 
complex types. The tracking issue is 
 &lt;a 
href="https://github.com/apache/datafusion-comet/issues/1043"&gt;https://github.com/apache/datafusion-comet/issues/1043&lt;/a&gt;.&lt;/p&gt;
 &lt;h2&gt;Updates to supported Spark versions&lt;/h2&gt;
 &lt;ul&gt;


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@datafusion.apache.org
For additional commands, e-mail: commits-h...@datafusion.apache.org
