(datafusion-site) branch asf-staging updated: Commit build products

github-bot Mon, 08 Sep 2025 13:15:25 -0700

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git



The following commit(s) were added to refs/heads/asf-staging by this push:
     new 824c253  Commit build products
824c253 is described below

commit 824c253213d0903aad940570b467f0fcaec477ca
Author: Build Pelican (action) <priv...@infra.apache.org>
AuthorDate: Mon Sep 8 15:27:55 2025 +0000

    Commit build products
---
 blog/2025/09/10/dynamic-filters/index.html                   | 12 ++++++------
 ...garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml | 12 ++++++------
 blog/feeds/all-en.atom.xml                                   | 12 ++++++------
 blog/feeds/blog.atom.xml                                     | 12 ++++++------
 4 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/blog/2025/09/10/dynamic-filters/index.html b/blog/2025/09/10/dynamic-filters/index.html
index 64a7645..280c9cf 100644
--- a/blog/2025/09/10/dynamic-filters/index.html
+++ b/blog/2025/09/10/dynamic-filters/index.html
@@ -230,11 +230,11 @@ data tends to be <em>roughly</em> sorted (e.g. if you 
append to files as you rec
 it) but that does not guarantee that it is fully sorted, either within or 
between
 files. </p>
 <p>We <a href="https://github.com/apache/datafusion/issues/15037">discussed 
possible solutions</a> with the community, and ultimately decided to
-implement a generic "dynamic filters", which is general enough to be used in
+implement generic "dynamic filters", which are general enough to be used in
 joins as well (see next section). Our implementation appears very similar to
 recently announced optimizations in closed-source, commercial systems such as
 <a href="https://program.berlinbuzzwords.de/bbuzz24/talk/3DTQJB/">Accelerating 
TopK Queries in Snowflake</a>, or <a 
href="https://www.alibabacloud.com/blog/about-database-kernel-%7C-learn-about-polardb-imci-optimization-techniques_600274">self-sharpening
 runtime filters in
-Alibaba Cloud's PolarDB</a>, and we are excited we can offer similar features
+Alibaba Cloud's PolarDB</a>, and we are excited that we can offer similar 
features
 in an open source query engine like DataFusion.</p>
 <p>At the query plan level, Q23 looks like this before it is executed:</p>
 <pre><code 
class="language-text">&boxdr;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxdl;
@@ -259,7 +259,7 @@ in an open source query engine like DataFusion.</p>
 filter is shown as <code>true</code> in the <code>predicate</code> field of 
the <code>DataSourceExec</code>
 operator.</p>
 <p>The dynamic filter is updated by the <code>SortExec(TopK)</code> operator 
during execution
-as it processes rows, as shown in Figure 6.</p>
+as shown in Figure 6.</p>
 <pre><code 
class="language-text">&boxdr;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxh;&boxdl;
 &boxv;       SortExec(TopK)      &boxv;
 &boxv;    --------------------   &boxv;
@@ -427,7 +427,7 @@ make working with dynamic filters more performant for 
specific use cases:</p>
   support specific static filter patterns (e.g. stats pruning rewrites).</p>
 </li>
 </ul>
-<p>This is all implementing in the <code>DynamicFilterPhysicalExpr</code> 
struct.</p>
+<p>This is all implemented in the <code>DynamicFilterPhysicalExpr</code> 
struct.</p>
 <p>Another important design point was handling concurrency and information
 flow. In early designs, the scan polled the source operators on every row /
 batch, which had significant overhead. The final design is a "push" model where
@@ -439,12 +439,12 @@ operator.</p>
 <h2>Future Work</h2>
 <p>Although we've made great progress and DataFusion now has one of the most
 advanced open-source dynamic filter / sideways information passing
-implementations that we know of, we seemany areas of future improvement such 
as:</p>
+implementations that we know of, we see many areas of future improvement such 
as:</p>
 <ul>
 <li>
 <p><a href="https://github.com/apache/datafusion/issues/16973">Support for 
more types of joins</a>: This optimization is only implemented for
   <code>INNER</code> hash joins so far, but it could be implemented for other 
join algorithms
-  (e.g. nested loop joins, ) and join types (e.g. <code>LEFT OUTER 
JOIN</code>).</p>
+  (e.g. nested loop joins) and join types (e.g. <code>LEFT OUTER 
JOIN</code>).</p>
 </li>
 <li>
 <p><a href="https://github.com/apache/datafusion/issues/17171">Push down 
entire hash tables to the scan operator</a>: Improve the representation
diff --git a/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml b/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
index 6065990..b1b3934 100644
--- a/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
+++ b/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
@@ -214,11 +214,11 @@ data tends to be &lt;em&gt;roughly&lt;/em&gt; sorted 
(e.g. if you append to file
 it) but that does not guarantee that it is fully sorted, either within or 
between
 files. &lt;/p&gt;
 &lt;p&gt;We &lt;a 
href="https://github.com/apache/datafusion/issues/15037"&gt;discussed possible 
solutions&lt;/a&gt; with the community, and ultimately decided to
-implement a generic "dynamic filters", which is general enough to be used in
+implement generic "dynamic filters", which are general enough to be used in
 joins as well (see next section). Our implementation appears very similar to
 recently announced optimizations in closed-source, commercial systems such as
 &lt;a 
href="https://program.berlinbuzzwords.de/bbuzz24/talk/3DTQJB/"&gt;Accelerating 
TopK Queries in Snowflake&lt;/a&gt;, or &lt;a 
href="https://www.alibabacloud.com/blog/about-database-kernel-%7C-learn-about-polardb-imci-optimization-techniques_600274"&gt;self-sharpening
 runtime filters in
-Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited we can offer similar 
features
+Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited that we can offer 
similar features
 in an open source query engine like DataFusion.&lt;/p&gt;
 &lt;p&gt;At the query plan level, Q23 looks like this before it is 
executed:&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
@@ -243,7 +243,7 @@ in an open source query engine like DataFusion.&lt;/p&gt;
 filter is shown as &lt;code&gt;true&lt;/code&gt; in the 
&lt;code&gt;predicate&lt;/code&gt; field of the 
&lt;code&gt;DataSourceExec&lt;/code&gt;
 operator.&lt;/p&gt;
 &lt;p&gt;The dynamic filter is updated by the 
&lt;code&gt;SortExec(TopK)&lt;/code&gt; operator during execution
-as it processes rows, as shown in Figure 6.&lt;/p&gt;
+as shown in Figure 6.&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
 &amp;boxv;       SortExec(TopK)      &amp;boxv;
 &amp;boxv;    --------------------   &amp;boxv;
@@ -411,7 +411,7 @@ make working with dynamic filters more performant for 
specific use cases:&lt;/p&
   support specific static filter patterns (e.g. stats pruning 
rewrites).&lt;/p&gt;
 &lt;/li&gt;
 &lt;/ul&gt;
-&lt;p&gt;This is all implementing in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
+&lt;p&gt;This is all implemented in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
 &lt;p&gt;Another important design point was handling concurrency and 
information
 flow. In early designs, the scan polled the source operators on every row /
 batch, which had significant overhead. The final design is a "push" model where
@@ -423,12 +423,12 @@ operator.&lt;/p&gt;
 &lt;h2&gt;Future Work&lt;/h2&gt;
 &lt;p&gt;Although we've made great progress and DataFusion now has one of the 
most
 advanced open-source dynamic filter / sideways information passing
-implementations that we know of, we seemany areas of future improvement such 
as:&lt;/p&gt;
+implementations that we know of, we see many areas of future improvement such 
as:&lt;/p&gt;
 &lt;ul&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/16973"&gt;Support for more 
types of joins&lt;/a&gt;: This optimization is only implemented for
   &lt;code&gt;INNER&lt;/code&gt; hash joins so far, but it could be 
implemented for other join algorithms
-  (e.g. nested loop joins, ) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
+  (e.g. nested loop joins) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/17171"&gt;Push down entire 
hash tables to the scan operator&lt;/a&gt;: Improve the representation
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 973ea55..18a70a4 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -214,11 +214,11 @@ data tends to be &lt;em&gt;roughly&lt;/em&gt; sorted 
(e.g. if you append to file
 it) but that does not guarantee that it is fully sorted, either within or 
between
 files. &lt;/p&gt;
 &lt;p&gt;We &lt;a 
href="https://github.com/apache/datafusion/issues/15037"&gt;discussed possible 
solutions&lt;/a&gt; with the community, and ultimately decided to
-implement a generic "dynamic filters", which is general enough to be used in
+implement generic "dynamic filters", which are general enough to be used in
 joins as well (see next section). Our implementation appears very similar to
 recently announced optimizations in closed-source, commercial systems such as
 &lt;a 
href="https://program.berlinbuzzwords.de/bbuzz24/talk/3DTQJB/"&gt;Accelerating 
TopK Queries in Snowflake&lt;/a&gt;, or &lt;a 
href="https://www.alibabacloud.com/blog/about-database-kernel-%7C-learn-about-polardb-imci-optimization-techniques_600274"&gt;self-sharpening
 runtime filters in
-Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited we can offer similar 
features
+Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited that we can offer 
similar features
 in an open source query engine like DataFusion.&lt;/p&gt;
 &lt;p&gt;At the query plan level, Q23 looks like this before it is 
executed:&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
@@ -243,7 +243,7 @@ in an open source query engine like DataFusion.&lt;/p&gt;
 filter is shown as &lt;code&gt;true&lt;/code&gt; in the 
&lt;code&gt;predicate&lt;/code&gt; field of the 
&lt;code&gt;DataSourceExec&lt;/code&gt;
 operator.&lt;/p&gt;
 &lt;p&gt;The dynamic filter is updated by the 
&lt;code&gt;SortExec(TopK)&lt;/code&gt; operator during execution
-as it processes rows, as shown in Figure 6.&lt;/p&gt;
+as shown in Figure 6.&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
 &amp;boxv;       SortExec(TopK)      &amp;boxv;
 &amp;boxv;    --------------------   &amp;boxv;
@@ -411,7 +411,7 @@ make working with dynamic filters more performant for 
specific use cases:&lt;/p&
   support specific static filter patterns (e.g. stats pruning 
rewrites).&lt;/p&gt;
 &lt;/li&gt;
 &lt;/ul&gt;
-&lt;p&gt;This is all implementing in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
+&lt;p&gt;This is all implemented in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
 &lt;p&gt;Another important design point was handling concurrency and 
information
 flow. In early designs, the scan polled the source operators on every row /
 batch, which had significant overhead. The final design is a "push" model where
@@ -423,12 +423,12 @@ operator.&lt;/p&gt;
 &lt;h2&gt;Future Work&lt;/h2&gt;
 &lt;p&gt;Although we've made great progress and DataFusion now has one of the 
most
 advanced open-source dynamic filter / sideways information passing
-implementations that we know of, we seemany areas of future improvement such 
as:&lt;/p&gt;
+implementations that we know of, we see many areas of future improvement such 
as:&lt;/p&gt;
 &lt;ul&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/16973"&gt;Support for more 
types of joins&lt;/a&gt;: This optimization is only implemented for
   &lt;code&gt;INNER&lt;/code&gt; hash joins so far, but it could be 
implemented for other join algorithms
-  (e.g. nested loop joins, ) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
+  (e.g. nested loop joins) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/17171"&gt;Push down entire 
hash tables to the scan operator&lt;/a&gt;: Improve the representation
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index b28dbbb..7626c76 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -214,11 +214,11 @@ data tends to be &lt;em&gt;roughly&lt;/em&gt; sorted 
(e.g. if you append to file
 it) but that does not guarantee that it is fully sorted, either within or 
between
 files. &lt;/p&gt;
 &lt;p&gt;We &lt;a 
href="https://github.com/apache/datafusion/issues/15037"&gt;discussed possible 
solutions&lt;/a&gt; with the community, and ultimately decided to
-implement a generic "dynamic filters", which is general enough to be used in
+implement generic "dynamic filters", which are general enough to be used in
 joins as well (see next section). Our implementation appears very similar to
 recently announced optimizations in closed-source, commercial systems such as
 &lt;a 
href="https://program.berlinbuzzwords.de/bbuzz24/talk/3DTQJB/"&gt;Accelerating 
TopK Queries in Snowflake&lt;/a&gt;, or &lt;a 
href="https://www.alibabacloud.com/blog/about-database-kernel-%7C-learn-about-polardb-imci-optimization-techniques_600274"&gt;self-sharpening
 runtime filters in
-Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited we can offer similar 
features
+Alibaba Cloud's PolarDB&lt;/a&gt;, and we are excited that we can offer 
similar features
 in an open source query engine like DataFusion.&lt;/p&gt;
 &lt;p&gt;At the query plan level, Q23 looks like this before it is 
executed:&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
@@ -243,7 +243,7 @@ in an open source query engine like DataFusion.&lt;/p&gt;
 filter is shown as &lt;code&gt;true&lt;/code&gt; in the 
&lt;code&gt;predicate&lt;/code&gt; field of the 
&lt;code&gt;DataSourceExec&lt;/code&gt;
 operator.&lt;/p&gt;
 &lt;p&gt;The dynamic filter is updated by the 
&lt;code&gt;SortExec(TopK)&lt;/code&gt; operator during execution
-as it processes rows, as shown in Figure 6.&lt;/p&gt;
+as shown in Figure 6.&lt;/p&gt;
 &lt;pre&gt;&lt;code 
class="language-text"&gt;&amp;boxdr;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxh;&amp;boxdl;
 &amp;boxv;       SortExec(TopK)      &amp;boxv;
 &amp;boxv;    --------------------   &amp;boxv;
@@ -411,7 +411,7 @@ make working with dynamic filters more performant for 
specific use cases:&lt;/p&
   support specific static filter patterns (e.g. stats pruning 
rewrites).&lt;/p&gt;
 &lt;/li&gt;
 &lt;/ul&gt;
-&lt;p&gt;This is all implementing in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
+&lt;p&gt;This is all implemented in the 
&lt;code&gt;DynamicFilterPhysicalExpr&lt;/code&gt; struct.&lt;/p&gt;
 &lt;p&gt;Another important design point was handling concurrency and 
information
 flow. In early designs, the scan polled the source operators on every row /
 batch, which had significant overhead. The final design is a "push" model where
@@ -423,12 +423,12 @@ operator.&lt;/p&gt;
 &lt;h2&gt;Future Work&lt;/h2&gt;
 &lt;p&gt;Although we've made great progress and DataFusion now has one of the 
most
 advanced open-source dynamic filter / sideways information passing
-implementations that we know of, we seemany areas of future improvement such 
as:&lt;/p&gt;
+implementations that we know of, we see many areas of future improvement such 
as:&lt;/p&gt;
 &lt;ul&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/16973"&gt;Support for more 
types of joins&lt;/a&gt;: This optimization is only implemented for
   &lt;code&gt;INNER&lt;/code&gt; hash joins so far, but it could be 
implemented for other join algorithms
-  (e.g. nested loop joins, ) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
+  (e.g. nested loop joins) and join types (e.g. &lt;code&gt;LEFT OUTER 
JOIN&lt;/code&gt;).&lt;/p&gt;
 &lt;/li&gt;
 &lt;li&gt;
 &lt;p&gt;&lt;a 
href="https://github.com/apache/datafusion/issues/17171"&gt;Push down entire 
hash tables to the scan operator&lt;/a&gt;: Improve the representation


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@datafusion.apache.org
For additional commands, e-mail: commits-h...@datafusion.apache.org
