nuno-faria commented on PR #103:
URL: https://github.com/apache/datafusion-site/pull/103#issuecomment-3262612288

   > I just gave this a read through and think it's looking great! I'd like to 
add a benchmark showing join performance numbers (@nuno-faria I think you had 
something already, would you like to use it and we can include you as an 
author?), otherwise I think this is g2g!
   
   Yes I can share some ad-hoc tests, using a simple join query with TPC-H data 
(sf=20). The ideal execution plan for the following query is to first filter 
`customer` by `c_phone` and then use the resulting data to filter `orders`, 
which is now possible with the dynamic filter pushdown:
   
   ```sql
   EXPLAIN ANALYZE
   SELECT * 
   FROM customer
   JOIN orders on c_custkey = o_custkey
   WHERE c_phone = '25-989-741-2988';
   ```
   
   At `49.0.2`, the entire `orders` table is retrieved:
   ```sql
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=11, elapsed_compute=15.756209ms]                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   |   HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(c_custkey@0, o_custkey@1)], metrics=[output_rows=11, 
elapsed_compute=144.066417ms, build_input_batches=1, build_input_rows=1, 
input_batches=3522, input_rows=30000000, output_batches=3811, 
build_mem_used=1096, build_time=3.5554ms, join_time=137.192905ms]               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=1, elapsed_compute=15.606µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                         |
   |                   |       RepartitionExec: 
partitioning=Hash([c_custkey@0], 12), input_partitions=12, 
metrics=[fetch_time=2.0014852s, repartition_time=17.211µs, send_time=7.143µs]   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                          |
   |                   |         CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=1, elapsed_compute=33.404µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                         |
   |                   |           FilterExec: c_phone@4 = 25-989-741-2988, 
metrics=[output_rows=1, elapsed_compute=28.9586ms]                              
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   |             DataSourceExec: file_groups={12 groups: 
[[customer.parquet:0..20627576], [customer.parquet:20627576..41255152], 
[customer.parquet:41255152..61882728], [customer.parquet:61882728..82510304], 
[customer.parquet:82510304..103137880], ...]}, projection=[c_custkey, c_name, 
c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment], 
file_type=parquet, predicate=c_phone@4 = 25-989-741-2988, 
pruning_predicate=c_phone_null_count@2 != row_count@3 AND c_phone_min@0 <= 
25-989-741-2988 AND 25-989-741-2988 <= c_phone_max@1, 
required_guarantees=[c_phone in (25-989-741-2988)]                              
                                                                                
                                                                                
                                                                                
                                                                                
                                    
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   | , metrics=[output_rows=3000000, elapsed_compute=12ns, 
batches_splitted=0, bytes_scanned=247499357, file_open_errors=0, 
file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, 
page_index_rows_pruned=0, predicate_evaluation_errors=0, 
pushdown_rows_matched=0, pushdown_rows_pruned=0, 
row_groups_matched_bloom_filter=0, row_groups_matched_statistics=25, 
row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, 
bloom_filter_eval_time=503.212µs, metadata_load_time=36.498512ms, 
page_index_eval_time=420ns, row_pushdown_eval_time=24ns, 
statistics_eval_time=382.712µs, time_elapsed_opening=42.3593ms, 
time_elapsed_processing=1.4428838s, time_elapsed_scanning_total=1.9582305s, 
time_elapsed_scanning_until_data=901.7049ms]                                    
                                                                                
                                                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                          |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=30000000, elapsed_compute=14.531075907s]                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   |       RepartitionExec: 
partitioning=Hash([o_custkey@1], 12), input_partitions=12, 
metrics=[fetch_time=10.0157432s, repartition_time=4.3250056s, 
send_time=5.8247922s]                                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[orders.parquet:0..105903266], [orders.parquet:105903266..211806532], 
[orders.parquet:211806532..317709798], [orders.parquet:317709798..423613064], 
[orders.parquet:423613064..529516330], ...]}, projection=[o_orderkey, 
o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, 
o_shippriority, o_comment], file_type=parquet, metrics=[output_rows=30000000, 
elapsed_compute=12ns, batches_splitted=0, bytes_scanned=1269546359, 
file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, 
page_index_rows_pruned=0, predicate_evaluation_errors=0, 
pushdown_rows_matched=0, pushdown_rows_pruned=0, 
row_groups_matched_bloom_filter=0, row_groups_matched_statistics=0, 
row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=0, 
bloom_filter_eval_time=24ns, metadata_load_time=124.885412ms, 
page_index_eval_time=24ns, row_pushdown_eval_
 time=24ns, statistics_eval_time=24ns, time_elapsed_opening=127.657ms, 
time_elapsed_processing=8.2648902s, time_elapsed_scanning_total=27.819907s, 
time_elapsed_scanning_until_data=432.8593ms]                                    
                                                                                
                                                                                
                                                                                
     |
   |                   |                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2.628s
   output_rows from orders = 30M of 30M
   ```
   
   At `50.0.0`, the dynamic filter will be able to cut down a large percentage 
of the table:
   ```sql
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=11, elapsed_compute=1.219905ms]                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   |                   |   HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(c_custkey@0, o_custkey@1)], metrics=[output_rows=11, 
elapsed_compute=47.499214ms, build_input_batches=1, build_input_rows=1, 
input_batches=437, input_rows=3686400, output_batches=473, build_mem_used=1096, 
build_time=19.4876ms, join_time=8.703002ms]                                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=1, elapsed_compute=13.004µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              |
   |                   |       RepartitionExec: 
partitioning=Hash([c_custkey@0], 12), input_partitions=12, 
metrics=[fetch_time=1.9160576s, repartition_time=10.611µs, send_time=5.443µs]   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
               |
   |                   |         CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=1, elapsed_compute=25.906µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              |
   |                   |           FilterExec: c_phone@4 = 25-989-741-2988, 
metrics=[output_rows=1, elapsed_compute=13.0737ms]                              
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   |                   |             DataSourceExec: file_groups={12 groups: 
[[customer.parquet:0..20627576], [customer.parquet:20627576..41255152], 
[customer.parquet:41255152..61882728], [customer.parquet:61882728..82510304], 
[customer.parquet:82510304..103137880], ...]}, projection=[c_custkey, c_name, 
c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment], 
file_type=parquet, predicate=c_phone@4 = 25-989-741-2988, 
pruning_predicate=c_phone_null_count@2 != row_count@3 AND c_phone_min@0 <= 
25-989-741-2988 AND 25-989-741-2988 <= c_phone_max@1, 
required_guarantees=[c_phone in (25-989-741-2988)], 
metrics=[output_rows=3000000, elapsed_compute=12ns, batches_split=0, 
bytes_scanned=247499357, file_open_errors=0, file_scan_errors=0, 
files_ranges_pruned_statistics=0, num_predicate_creation_errors=0, 
page_index_rows_matched=0, page_index_rows_pruned=0, 
predicate_evaluation_errors=0, pushdown_rows_matched=0, pushdown_rows_pruned=0, 
row_groups_matched_bloom_filter=0, row_groups_matc
 hed_statistics=25, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=0, bloom_filter_eval_time=583.912µs, 
metadata_load_time=4.968212ms, page_index_eval_time=420ns, 
row_pushdown_eval_time=24ns, statistics_eval_time=491.812µs, 
time_elapsed_opening=11.2265ms, time_elapsed_processing=1.4416089s, 
time_elapsed_scanning_total=1.9039586s, 
time_elapsed_scanning_until_data=894.5533ms]                                    
                                                                                
                                                                                
                                                                                
                                                                 |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=3686400, elapsed_compute=654.916706ms]                     
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   |                   |       RepartitionExec: 
partitioning=Hash([o_custkey@1], 12), input_partitions=12, 
metrics=[fetch_time=786.0281ms, repartition_time=215.841606ms, 
send_time=50.588072ms]                                                          
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[orders.parquet:0..105903266], [orders.parquet:105903266..211806532], 
[orders.parquet:211806532..317709798], [orders.parquet:317709798..423613064], 
[orders.parquet:423613064..529516330], ...]}, projection=[o_orderkey, 
o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, 
o_shippriority, o_comment], file_type=parquet, 
predicate=DynamicFilterPhysicalExpr [ o_custkey@1 >= 1 AND o_custkey@1 <= 1 ], 
pruning_predicate=o_custkey_null_count@1 != row_count@2 AND o_custkey_max@0 >= 
1 AND o_custkey_null_count@1 != row_count@2 AND o_custkey_min@3 <= 1, 
required_guarantees=[], metrics=[output_rows=3686400, elapsed_compute=12ns, 
batches_split=0, bytes_scanned=155993981, file_open_errors=0, 
file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, 
page_index_rows_pruned=0, predicate_evaluation_errors=0, 
pushdown_rows_matched=0, pushdown_rows_prun
 ed=0, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=9, 
row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=215, 
bloom_filter_eval_time=189.513µs, metadata_load_time=178.664112ms, 
page_index_eval_time=717ns, row_pushdown_eval_time=24ns, 
statistics_eval_time=429.513µs, time_elapsed_opening=181.4318ms, 
time_elapsed_processing=691.7683ms, time_elapsed_scanning_total=895.1489ms, 
time_elapsed_scanning_until_data=137.6795ms]                                    
                                                                                
                                                                                
                                                                                
     |
   |                   |                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
             |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   0.688s
   output_rows from orders = 3.6M of 30M
   ```
   
   However, the percentage of the table that is filtered out varies from 
execution to execution, since the build and probe sides are built concurrently, 
and sometimes more data is retrieved from the probe side before the dynamic 
filter is complete (additional context: 
https://github.com/apache/datafusion/pull/17197#issuecomment-3198010057). To 
improve this, we can enable `datafusion.execution.parquet.pushdown_filters`:
   
   ```sql
   set datafusion.execution.parquet.pushdown_filters = true;
   ```
   
   Now, the dynamic filter consistently filters out the entire table, selecting 
only the relevant rows:
   ```sql
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                  |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=11, elapsed_compute=6.905µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                   |
   |                   |   HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(c_custkey@0, o_custkey@1)], metrics=[output_rows=11, 
elapsed_compute=4.379914ms, build_input_batches=1, build_input_rows=1, 
input_batches=1, input_rows=11, output_batches=1, build_mem_used=1096, 
build_time=2.2824ms, join_time=32.702µs]                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                   |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=1, elapsed_compute=20.806µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                   |
   |                   |       RepartitionExec: 
partitioning=Hash([c_custkey@0], 12), input_partitions=12, 
metrics=[fetch_time=245.4898ms, repartition_time=15.111µs, send_time=5.543µs]   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                    |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[customer.parquet:0..20627576], [customer.parquet:20627576..41255152], 
[customer.parquet:41255152..61882728], [customer.parquet:61882728..82510304], 
[customer.parquet:82510304..103137880], ...]}, projection=[c_custkey, c_name, 
c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment], 
file_type=parquet, predicate=c_phone@4 = 25-989-741-2988, 
pruning_predicate=c_phone_null_count@2 != row_count@3 AND c_phone_min@0 <= 
25-989-741-2988 AND 25-989-741-2988 <= c_phone_max@1, 
required_guarantees=[c_phone in (25-989-741-2988)], metrics=[output_rows=1, 
elapsed_compute=12ns, batches_split=0, bytes_scanned=44262371, 
file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, 
page_index_rows_pruned=0, predicate_evaluation_errors=0, 
pushdown_rows_matched=1, pushdown_rows_pruned=2999999, 
row_groups_matched_bloom_filter=0, row_groups_matched_s
 tatistics=25, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=0, bloom_filter_eval_time=604.112µs, 
metadata_load_time=3.295812ms, page_index_eval_time=520ns, 
row_pushdown_eval_time=5.210212ms, statistics_eval_time=344.112µs, 
time_elapsed_opening=9.6014ms, time_elapsed_processing=162.9916ms, 
time_elapsed_scanning_total=235.7456ms, 
time_elapsed_scanning_until_data=228.1358ms]                                    
                                                                                
                                                                                
                                                                                
                                                                      |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=11, elapsed_compute=161.908µs]                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                   |
   |                   |       RepartitionExec: 
partitioning=Hash([o_custkey@1], 12), input_partitions=12, 
metrics=[fetch_time=278.4473ms, repartition_time=211.906µs, send_time=43.638µs] 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                    |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[orders.parquet:0..105903266], [orders.parquet:105903266..211806532], 
[orders.parquet:211806532..317709798], [orders.parquet:317709798..423613064], 
[orders.parquet:423613064..529516330], ...]}, projection=[o_orderkey, 
o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, 
o_shippriority, o_comment], file_type=parquet, 
predicate=DynamicFilterPhysicalExpr [ o_custkey@1 >= 1 AND o_custkey@1 <= 1 ], 
pruning_predicate=o_custkey_null_count@1 != row_count@2 AND o_custkey_max@0 >= 
1 AND o_custkey_null_count@1 != row_count@2 AND o_custkey_min@3 <= 1, 
required_guarantees=[], metrics=[output_rows=11, elapsed_compute=12ns, 
batches_split=0, bytes_scanned=68872932, file_open_errors=0, 
file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, 
page_index_rows_pruned=0, predicate_evaluation_errors=0, 
pushdown_rows_matched=11, pushdown_rows_pruned=36
 86389, row_groups_matched_bloom_filter=0, row_groups_matched_statistics=10, 
row_groups_pruned_bloom_filter=0, row_groups_pruned_statistics=215, 
bloom_filter_eval_time=231.613µs, metadata_load_time=35.709112ms, 
page_index_eval_time=519ns, row_pushdown_eval_time=6.038918ms, 
statistics_eval_time=361.613µs, time_elapsed_opening=38.8879ms, 
time_elapsed_processing=224.2144ms, time_elapsed_scanning_total=239.723ms, 
time_elapsed_scanning_until_data=131.8035ms]                                    
                                                                                
                                                                                
                                                                                
     |
   |                   |                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                  |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   0.104s
   output_rows from orders = 11 of 30M
   ```
   
   This results in a >20x speed up. Here is an overview (the `pushdown_filters` 
was enabled in 49.0.2 for consistency, but does not change the runtime 
considerably):
   
   ```
   
┌────────────────────────────┬────────────────────┬─────────────────────────────────┐
   │                            │ Execution time (s) │ Filtered out rows from 
"orders" │
   
├────────────────────────────┼────────────────────┼─────────────────────────────────┤
   │           49.0.2           │       2.521        │               0%         
       │
   │  [pushdown_filters=true]   │                    │                          
       │
   
├────────────────────────────┼────────────────────┼─────────────────────────────────┤
   │           50.0.0           │       0.688        │              ~88%        
       │
   │ [dynamic_filter_pushdown]  │                    │                          
       │
   
├────────────────────────────┼────────────────────┼─────────────────────────────────┤
   │           50.0.0           │       0.104        │              ~100%       
       │
   │ [dynamic_filter_pushdown   │                    │                          
       │
   │ & pushdown_filters=true]   │                    │                          
       │
   
└────────────────────────────┴────────────────────┴─────────────────────────────────┘
   ```
   
   <img width="786" height="365" alt="image" 
src="https://github.com/user-attachments/assets/e32ba7bb-4f5a-400f-bbb8-277d859a98a6";
 />
   
   I also tried to show that using multiple filters would result in good 
performance. However, I found some inconclusive results. For example, here is 
the same query as the one above, but with a `c_phone = '25-989-741-2988' or 
c_phone = '29-590-168-8634'` filter:
   
   ```sql
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                            |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=26, elapsed_compute=12.708µs]                              
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                             |
   |                   |   HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(c_custkey@0, o_custkey@1)], metrics=[output_rows=26, 
elapsed_compute=5.897813ms, build_input_batches=2, build_input_rows=2, 
input_batches=2, input_rows=26, output_batches=2, build_mem_used=1345, 
build_time=3.0268ms, join_time=59.501µs]                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
 
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                             |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=2, elapsed_compute=46.206µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                             |
   |                   |       RepartitionExec: 
partitioning=Hash([c_custkey@0], 12), input_partitions=12, 
metrics=[fetch_time=270.5539ms, repartition_time=30.21µs, send_time=9.042µs]    
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                              |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[customer.parquet:0..20627576], [customer.parquet:20627576..41255152], 
[customer.parquet:41255152..61882728], [customer.parquet:61882728..82510304], 
[customer.parquet:82510304..103137880], ...]}, projection=[c_custkey, c_name, 
c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment], 
file_type=parquet, predicate=c_phone@4 = 25-989-741-2988 OR c_phone@4 = 
29-590-168-8634, pruning_predicate=c_phone_null_count@2 != row_count@3 AND 
c_phone_min@0 <= 25-989-741-2988 AND 25-989-741-2988 <= c_phone_max@1 OR 
c_phone_null_count@2 != row_count@3 AND c_phone_min@0 <= 29-590-168-8634 AND 
29-590-168-8634 <= c_phone_max@1, required_guarantees=[c_phone in 
(25-989-741-2988, 29-590-168-8634)], metrics=[output_rows=2, 
elapsed_compute=12ns, batches_split=0, bytes_scanned=47855367, 
file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, page_i
 ndex_rows_pruned=0, predicate_evaluation_errors=0, pushdown_rows_matched=2, 
pushdown_rows_pruned=2999998, row_groups_matched_bloom_filter=0, 
row_groups_matched_statistics=25, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=0, bloom_filter_eval_time=629.912µs, 
metadata_load_time=2.153612ms, page_index_eval_time=519ns, 
row_pushdown_eval_time=9.425812ms, statistics_eval_time=257.312µs, 
time_elapsed_opening=6.6487ms, time_elapsed_processing=178.9948ms, 
time_elapsed_scanning_total=263.8138ms, 
time_elapsed_scanning_until_data=255.8887ms]                                    
                                                                                
                                                                                
                                                                                
                                                                                
                               |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=26, elapsed_compute=606.205µs]                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                             |
   |                   |       RepartitionExec: 
partitioning=Hash([o_custkey@1], 12), input_partitions=12, 
metrics=[fetch_time=952.4997ms, repartition_time=720.001µs, 
send_time=177.128µs]                                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                              |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[orders.parquet:0..105903266], [orders.parquet:105903266..211806532], 
[orders.parquet:211806532..317709798], [orders.parquet:317709798..423613064], 
[orders.parquet:423613064..529516330], ...]}, projection=[o_orderkey, 
o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, 
o_shippriority, o_comment], file_type=parquet, 
predicate=DynamicFilterPhysicalExpr [ o_custkey@1 >= 2999998 AND o_custkey@1 <= 
2999998 OR o_custkey@1 >= 1 AND o_custkey@1 <= 1 ], 
pruning_predicate=o_custkey_null_count@1 != row_count@2 AND o_custkey_max@0 >= 
2999998 AND o_custkey_null_count@1 != row_count@2 AND o_custkey_min@3 <= 
2999998 OR o_custkey_null_count@1 != row_count@2 AND o_custkey_max@0 >= 1 AND 
o_custkey_null_count@1 != row_count@2 AND o_custkey_min@3 <= 1, 
required_guarantees=[], metrics=[output_rows=26, elapsed_compute=12ns, 
batches_split=0, bytes_scanned=182217584, file_open_errors=0, 
file_scan_errors=0, f
 iles_ranges_pruned_statistics=0, num_predicate_creation_errors=0, 
page_index_rows_matched=0, page_index_rows_pruned=0, 
predicate_evaluation_errors=0, pushdown_rows_matched=26, 
pushdown_rows_pruned=13516774, row_groups_matched_bloom_filter=0, 
row_groups_matched_statistics=28, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=135, bloom_filter_eval_time=855.716µs, 
metadata_load_time=21.491212ms, page_index_eval_time=916ns, 
row_pushdown_eval_time=42.977712ms, statistics_eval_time=515.616µs, 
time_elapsed_opening=27.1932ms, time_elapsed_processing=644.1071ms, 
time_elapsed_scanning_total=926.2487ms, 
time_elapsed_scanning_until_data=504.9686ms]                                    
                                                                                
                                                                                
                                                                                
     |
   |                   |                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                            |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   0.166s
   output_rows from orders = 26 of 30M
   [ o_custkey@1 >= 2999998 AND o_custkey@1 <= 2999998 OR o_custkey@1 >= 1 AND 
o_custkey@1 <= 1 ]
   ```
   
   The dynamic filter is pretty efficient. However, consider the same query, 
but using `c_phone = '25-989-741-2988' or c_phone = '29-699-674-3526'` as the 
filter:
   ```sql
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type         | plan                                                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | Plan with Metrics | CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=35, elapsed_compute=15.718506ms]                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   |                   |   HashJoinExec: mode=Partitioned, join_type=Inner, 
on=[(c_custkey@0, o_custkey@1)], metrics=[output_rows=35, 
elapsed_compute=147.439515ms, build_input_batches=1, build_input_rows=2, 
input_batches=3520, input_rows=30000000, output_batches=3810, 
build_mem_used=1516, build_time=4.1056ms, join_time=139.447503ms]               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=2, elapsed_compute=50.408µs]                               
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                     |
   |                   |       RepartitionExec: 
partitioning=Hash([c_custkey@0], 12), input_partitions=12, 
metrics=[fetch_time=289.6309ms, repartition_time=40.51µs, send_time=10.542µs]   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
         
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[customer.parquet:0..20627576], [customer.parquet:20627576..41255152], 
[customer.parquet:41255152..61882728], [customer.parquet:61882728..82510304], 
[customer.parquet:82510304..103137880], ...]}, projection=[c_custkey, c_name, 
c_address, c_nationkey, c_phone, c_acctbal, c_mktsegment, c_comment], 
file_type=parquet, predicate=c_phone@4 = 25-989-741-2988 OR c_phone@4 = 
29-699-674-3526, pruning_predicate=c_phone_null_count@2 != row_count@3 AND 
c_phone_min@0 <= 25-989-741-2988 AND 25-989-741-2988 <= c_phone_max@1 OR 
c_phone_null_count@2 != row_count@3 AND c_phone_min@0 <= 29-699-674-3526 AND 
29-699-674-3526 <= c_phone_max@1, required_guarantees=[c_phone in 
(25-989-741-2988, 29-699-674-3526)], metrics=[output_rows=2, 
elapsed_compute=12ns, batches_split=0, bytes_scanned=47855367, 
file_open_errors=0, file_scan_errors=0, files_ranges_pruned_statistics=0, 
num_predicate_creation_errors=0, page_index_rows_matched=0, page_i
 ndex_rows_pruned=0, predicate_evaluation_errors=0, pushdown_rows_matched=2, 
pushdown_rows_pruned=2999998, row_groups_matched_bloom_filter=0, 
row_groups_matched_statistics=25, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=0, bloom_filter_eval_time=999.212µs, 
metadata_load_time=16.355212ms, page_index_eval_time=817ns, 
row_pushdown_eval_time=14.076612ms, statistics_eval_time=765.912µs, 
time_elapsed_opening=28.8661ms, time_elapsed_processing=224.1269ms, 
time_elapsed_scanning_total=260.453ms, 
time_elapsed_scanning_until_data=252.3484ms]                                    
                                                                                
                                                                                
                                                                                
     |
   |                   |     CoalesceBatchesExec: target_batch_size=8192, 
metrics=[output_rows=30000000, elapsed_compute=13.705637305s]                   
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   |                   |       RepartitionExec: 
partitioning=Hash([o_custkey@1], 12), input_partitions=12, 
metrics=[fetch_time=11.5848169s, repartition_time=4.4595235s, 
send_time=5.2565265s]                                                           
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                             
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   |                   |         DataSourceExec: file_groups={12 groups: 
[[orders.parquet:0..105903266], [orders.parquet:105903266..211806532], 
[orders.parquet:211806532..317709798], [orders.parquet:317709798..423613064], 
[orders.parquet:423613064..529516330], ...]}, projection=[o_orderkey, 
o_custkey, o_orderstatus, o_totalprice, o_orderdate, o_orderpriority, o_clerk, 
o_shippriority, o_comment], file_type=parquet, 
predicate=DynamicFilterPhysicalExpr [ o_custkey@1 >= 1 AND o_custkey@1 <= 
2999999 ], pruning_predicate=o_custkey_null_count@1 != row_count@2 AND 
o_custkey_max@0 >= 1 AND o_custkey_null_count@1 != row_count@2 AND 
o_custkey_min@3 <= 2999999, required_guarantees=[], 
metrics=[output_rows=30000000, elapsed_compute=12ns, batches_split=0, 
bytes_scanned=1269546359, file_open_errors=0, file_scan_errors=0, 
files_ranges_pruned_statistics=0, num_predicate_creation_errors=0, 
page_index_rows_matched=0, page_index_rows_pruned=0, 
predicate_evaluation_errors=0, pushdown_rows_matched=3000000
 0, pushdown_rows_pruned=0, row_groups_matched_bloom_filter=0, 
row_groups_matched_statistics=245, row_groups_pruned_bloom_filter=0, 
row_groups_pruned_statistics=0, bloom_filter_eval_time=4.080812ms, 
metadata_load_time=29.209612ms, page_index_eval_time=717ns, 
row_pushdown_eval_time=88.763912ms, statistics_eval_time=367.312µs, 
time_elapsed_opening=36.8624ms, time_elapsed_processing=9.1036123s, 
time_elapsed_scanning_total=28.005503s, 
time_elapsed_scanning_until_data=496.3794ms]                                    
                                                                                
                                                                                
                                                                                
                                                                                
 |
   |                   |                                                        
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                      
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
                                    |
   
+-------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2.508s
   output_rows from orders = 30M of 30M
   [ o_custkey@1 >= 1 AND o_custkey@1 <= 2999999 ]
   ```
   
   The dynamic filter now combines the partial filters, which ends up 
retrieving the entire table. I think it might be worth looking at always 
combining the partial filters with `OR`, unless maybe if the dynamic filter 
becomes too large. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to