This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-staging in repository https://gitbox.apache.org/repos/asf/datafusion-site.git
The following commit(s) were added to refs/heads/asf-staging by this push:
     new 69e6e0c  Commit build products
69e6e0c is described below

commit 69e6e0c0c5dc40f87b57ee7118d4d68efce004b0
Author: Build Pelican (action) <priv...@infra.apache.org>
AuthorDate: Wed Sep 10 15:26:49 2025 +0000

    Commit build products
---
 blog/2025/09/10/dynamic-filters/index.html                  | 13 +++++++------
 ...arcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml | 13 +++++++------
 blog/feeds/all-en.atom.xml                                  | 13 +++++++------
 blog/feeds/blog.atom.xml                                    | 13 +++++++------
 4 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/blog/2025/09/10/dynamic-filters/index.html b/blog/2025/09/10/dynamic-filters/index.html
index a7a48d9..744ea07 100644
--- a/blog/2025/09/10/dynamic-filters/index.html
+++ b/blog/2025/09/10/dynamic-filters/index.html
@@ -303,12 +303,13 @@ input and the other input to be the "probe" side.</p>
 filters.</p>
 </li>
 </ul>
-<p>Many hash joins are very selective (only a small number of rows are matched), so
-it is natural to use the same dynamic filter technique. DataFusion 50.0.0 pushes
-down knowledge of what keys exist on the build side into the scan of the probe
-side with a dynamic filter based on min/max join key values. For example, if the
-build side only has keys in the range <code>[100, 200]</code>, then DataFusion will filter
-all probe rows with keys outside that range during the scan.</p>
+<p>Many hash joins act as selective filters for rows from the probe side (when only
+a small number of rows are matched), so it is natural to use the same dynamic
+filter technique. DataFusion 50.0.0 pushes down knowledge of what keys exist on
+the build side into the scan of the probe side with a dynamic filter based on
+min/max join key values. For example, if the build side only has keys in the
+range <code>[100, 200]</code>, then DataFusion will filter out all probe rows with keys
+outside that range during the scan.</p>
 <p>This simple approach is fast to evaluate and the filter improves performance
 significantly when combined with statistics pruning, late materialization, and
 other optimizations as shown in Figure 7.</p>
diff --git a/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml b/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
index 306da4d..41e6537 100644
--- a/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
+++ b/blog/feeds/adrian-garcia-badaracco-pydantic-andrew-lamb-influxdata.atom.xml
@@ -287,12 +287,13 @@ input and the other input to be the "probe" side.</p>
 filters.</p>
 </li>
 </ul>
-<p>Many hash joins are very selective (only a small number of rows are matched), so
-it is natural to use the same dynamic filter technique. DataFusion 50.0.0 pushes
-down knowledge of what keys exist on the build side into the scan of the probe
-side with a dynamic filter based on min/max join key values. For example, if the
-build side only has keys in the range <code>[100, 200]</code>, then DataFusion will filter
-all probe rows with keys outside that range during the scan.</p>
+<p>Many hash joins act as selective filters for rows from the probe side (when only
+a small number of rows are matched), so it is natural to use the same dynamic
+filter technique. DataFusion 50.0.0 pushes down knowledge of what keys exist on
+the build side into the scan of the probe side with a dynamic filter based on
+min/max join key values. For example, if the build side only has keys in the
+range <code>[100, 200]</code>, then DataFusion will filter out all probe rows with keys
+outside that range during the scan.</p>
 <p>This simple approach is fast to evaluate and the filter improves performance
 significantly when combined with statistics pruning, late materialization, and
 other optimizations as shown in Figure 7.</p>
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 02ca4db..224fc99 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -287,12 +287,13 @@ input and the other input to be the "probe" side.</p>
 filters.</p>
 </li>
 </ul>
-<p>Many hash joins are very selective (only a small number of rows are matched), so
-it is natural to use the same dynamic filter technique. DataFusion 50.0.0 pushes
-down knowledge of what keys exist on the build side into the scan of the probe
-side with a dynamic filter based on min/max join key values. For example, if the
-build side only has keys in the range <code>[100, 200]</code>, then DataFusion will filter
-all probe rows with keys outside that range during the scan.</p>
+<p>Many hash joins act as selective filters for rows from the probe side (when only
+a small number of rows are matched), so it is natural to use the same dynamic
+filter technique. DataFusion 50.0.0 pushes down knowledge of what keys exist on
+the build side into the scan of the probe side with a dynamic filter based on
+min/max join key values. For example, if the build side only has keys in the
+range <code>[100, 200]</code>, then DataFusion will filter out all probe rows with keys
+outside that range during the scan.</p>
 <p>This simple approach is fast to evaluate and the filter improves performance
 significantly when combined with statistics pruning, late materialization, and
 other optimizations as shown in Figure 7.</p>
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index d0a972a..8f5c701 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -287,12 +287,13 @@ input and the other input to be the "probe" side.</p>
 filters.</p>
 </li>
 </ul>
-<p>Many hash joins are very selective (only a small number of rows are matched), so
-it is natural to use the same dynamic filter technique. DataFusion 50.0.0 pushes
-down knowledge of what keys exist on the build side into the scan of the probe
-side with a dynamic filter based on min/max join key values. For example, if the
-build side only has keys in the range <code>[100, 200]</code>, then DataFusion will filter
-all probe rows with keys outside that range during the scan.</p>
+<p>Many hash joins act as selective filters for rows from the probe side (when only
+a small number of rows are matched), so it is natural to use the same dynamic
+filter technique. DataFusion 50.0.0 pushes down knowledge of what keys exist on
+the build side into the scan of the probe side with a dynamic filter based on
+min/max join key values. For example, if the build side only has keys in the
+range <code>[100, 200]</code>, then DataFusion will filter out all probe rows with keys
+outside that range during the scan.</p>
 <p>This simple approach is fast to evaluate and the filter improves performance
 significantly when combined with statistics pruning, late materialization, and
 other optimizations as shown in Figure 7.</p>
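
Note on the changed paragraph: the min/max join-key filter it describes can be illustrated with a small standalone sketch. This is not DataFusion code and does not use any DataFusion API. The function names (build_side_bounds, filter_probe_rows), the tuple row layout, and the treatment of an empty build side as "no matches" (which assumes an inner join) are all hypothetical choices made for illustration. The [100, 200] range comes from the example in the post.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Row = Tuple[int, str]  # hypothetical row layout: (join_key, payload)


def build_side_bounds(build_keys: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Return (min, max) of the build-side join keys, or None if there are none."""
    keys = list(build_keys)
    if not keys:
        return None
    return min(keys), max(keys)


def filter_probe_rows(
    probe_rows: Iterable[Row],
    bounds: Optional[Tuple[int, int]],
    key: Callable[[Row], int] = lambda row: row[0],
) -> List[Row]:
    """Keep only probe rows whose join key lies inside the build-side range."""
    if bounds is None:
        # Assumption: inner join, so an empty build side can match nothing.
        return []
    lo, hi = bounds
    # Inclusive range check, mirroring the [100, 200] example in the post.
    return [row for row in probe_rows if lo <= key(row) <= hi]


build_keys = [100, 150, 200]
probe_rows = [(50, "a"), (100, "b"), (175, "c"), (200, "d"), (250, "e")]

bounds = build_side_bounds(build_keys)
kept = filter_probe_rows(probe_rows, bounds)
print(bounds)  # (100, 200)
print(kept)    # [(100, 'b'), (175, 'c'), (200, 'd')]
```

The row with key 175 survives the range check even though no build row has that key. A min/max filter is only a coarse pre-filter that discards rows that cannot possibly match, and the hash join itself still performs the exact key comparison afterwards.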