2010YOUY01 opened a new issue, #18295:
URL: https://github.com/apache/datafusion/issues/18295

   ### Is your feature request related to a problem or challenge?
   
   Related to https://github.com/apache/datafusion/issues/15628
   
   For queries like
   ```
   select *
   from t1
   join t2
   on expensive_and_selective_predicate(t1.v1, t2.v1)
   limit 10
   ```
   
   The query plan will look like:
   ```
   -- For simpler demonstration
   > set datafusion.execution.target_partitions = 1;
   0 row(s) fetched.
   Elapsed 0.000 seconds.
   
   > explain analyze select *
   from generate_series(100000) as t1(v1)
   join generate_series(100000) as t2(v1)
   on t1.v1 < t2.v1
   limit 10;
   
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                                                                               |
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=10, elapsed_compute=5.917µs]                                                                                                                                                               |
|                   |   NestedLoopJoinExec: join_type=Inner, filter=v1@0 < v1@1, metrics=[output_rows=8192, elapsed_compute=582.002µs, build_input_batches=13, build_input_rows=100001, input_batches=1, input_rows=8192, output_batches=1, build_mem_used=853216, build_time=440.208µs, join_time=141.792µs] |
|                   |     ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=100001, elapsed_compute=6.415µs]                                                                                                                                                    |
|                   |       LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=100000, batch_size=8192], metrics=[output_rows=100001, elapsed_compute=223.997µs]                                                                               |
|                   |     ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=8192, elapsed_compute=792ns]                                                                                                                                                        |
|                   |       LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=100000, batch_size=8192], metrics=[output_rows=8192, elapsed_compute=15.667µs]                                                                                  |
|                   |                                                                                                                                                                                                                                                    |
+-------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   1 row(s) fetched.
   Elapsed 0.004 seconds.
   ```
   
   ### Issue
   Some early termination is already applied to avoid running the join to completion, but it happens at the batch level rather than the row level. It is possible to terminate sooner and speed up workloads like the one above.
   
   The global limit operator terminates execution once the target limit is reached, but the nested loop join still accumulates `batch_size` (default 8192) rows internally before producing output, instead of stopping as soon as the limit of `10` rows is reached.
   
   ### Potential Optimization
   Ideally the join could stop as soon as `limit` rows have been produced, making it roughly 800x faster in the best case (8192 buffer size / 10 limit ≈ 819).
   
   ### Describe the solution you'd like
   
   Push the limit down into the nested loop join operator (and potentially other join operators as well), so that it terminates as soon as the limit is reached.
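   To illustrate the idea, here is a minimal, hypothetical sketch (not DataFusion's actual `NestedLoopJoinExec` API): a nested loop join that receives a pushed-down `fetch` limit and checks it per matched row, so it can stop the whole join as soon as `fetch` rows are produced instead of filling a full 8192-row batch first. The function name, signature, and plain `Vec` inputs are all assumptions for demonstration only.
   
   ```rust
   // Hypothetical sketch of limit pushdown into a nested loop join.
   // `fetch` plays the role of the pushed-down LIMIT; the real operator
   // works on Arrow RecordBatches, not plain slices.
   fn nested_loop_join_with_fetch(
       build: &[i64],
       probe: &[i64],
       fetch: usize,
       predicate: impl Fn(i64, i64) -> bool,
   ) -> Vec<(i64, i64)> {
       let mut out = Vec::with_capacity(fetch);
       'outer: for &p in probe {
           for &b in build {
               if predicate(b, p) {
                   out.push((b, p));
                   // Row-level early termination: stop the entire join once
                   // `fetch` rows exist, rather than after a full batch.
                   if out.len() >= fetch {
                       break 'outer;
                   }
               }
           }
       }
       out
   }
   
   fn main() {
       // Same shape as the example query: t1.v1 < t2.v1, limit 10.
       let build: Vec<i64> = (0..=100_000).collect();
       let probe: Vec<i64> = (0..=100_000).collect();
       let rows = nested_loop_join_with_fetch(&build, &probe, 10, |b, p| b < p);
       // Only 10 rows are produced, and the loops never scan far past them.
       assert_eq!(rows.len(), 10);
       println!("{rows:?}");
   }
   ```
   
   The point of the sketch is where the check sits: inside the inner match loop, not after a batch is assembled, which is what lets the operator finish after touching only a handful of probe rows.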
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

