2010YOUY01 commented on code in PR #20228:
URL: https://github.com/apache/datafusion/pull/20228#discussion_r2781796159


##########
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##########
@@ -1476,12 +1506,53 @@ impl ExecutionPlan for HashJoinExec {
                         filter: dynamic_filter,
                         build_accumulator: OnceLock::new(),
                     }),
+                    fetch: self.fetch,
                 });
                 result = result.with_updated_node(new_node as Arc<dyn ExecutionPlan>);
             }
         }
         Ok(result)
     }
+
+    fn supports_limit_pushdown(&self) -> bool {
+        // Hash join execution plan does not support pushing limit down through to children
+        // because the children don't know about the join condition and can't
+        // determine how many rows to produce
+        false
+    }
+
+    fn fetch(&self) -> Option<usize> {
+        self.fetch
+    }
+
+    fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>> {
+        // Null-aware anti join requires seeing ALL probe rows to check for NULLs.
+        // If any probe row has NULL, the output must be empty.
+        // We can't stop early or we might miss a NULL and return wrong results.
+        if self.null_aware {

Review Comment:
   I don't understand this part.
   
   The `output_buffer` will only get filled when an output entry is finalized, so this should be handled automatically?
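
To make the question concrete, here is a toy model of the behavior described in the comment. It is not DataFusion's code: the function name `null_aware_anti_join` and the use of plain `i64` keys are assumptions for illustration only. The sketch scans the whole probe side before emitting anything, so a limit applied to the finalized output cannot cause a NULL to be missed.

```rust
use std::collections::HashSet;

/// Toy model (hypothetical, not DataFusion's implementation) of a
/// null-aware anti join with an optional row limit.
///
/// Output rows come from `build` and are only decided after every probe
/// row has been inspected.
fn null_aware_anti_join(
    build: &[i64],
    probe: &[Option<i64>],
    fetch: Option<usize>,
) -> Vec<i64> {
    let mut probe_keys = HashSet::new();
    for key in probe {
        match key {
            // Any NULL on the probe side makes the result empty.
            None => return Vec::new(),
            Some(k) => {
                probe_keys.insert(*k);
            }
        }
    }

    // Only now is the output "finalized": build rows with no probe match.
    let unmatched = build.iter().copied().filter(|k| !probe_keys.contains(k));

    // The limit is applied to finalized output, after the full probe scan.
    match fetch {
        Some(n) => unmatched.take(n).collect(),
        None => unmatched.collect(),
    }
}
```

In this model the limit only shortens the final emission step. Whether the real `output_buffer` behaves the same way is exactly what the comment asks.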



##########
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##########
@@ -258,6 +258,8 @@ pub struct HashJoinExecBuilder {
     partition_mode: PartitionMode,
     null_equality: NullEquality,
     null_aware: bool,
+    /// Maximum number of rows to return

Review Comment:
   ```suggestion
       /// Maximum number of rows to return
       ///
       /// If the operator produces `< fetch` rows, it returns all available rows.
       /// If it produces `>= fetch` rows, it returns exactly `fetch` rows and stops early.
   ```
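
The semantics in the suggested doc comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the operator's actual code: `apply_fetch` is a hypothetical helper, and batches are modeled only by their row counts.

```rust
/// Hypothetical helper: given the row counts of successive output batches
/// and an optional `fetch` limit, returns how many rows are emitted from
/// each batch before the operator stops.
fn apply_fetch(batch_sizes: &[usize], fetch: Option<usize>) -> Vec<usize> {
    let mut remaining = fetch;
    let mut emitted = Vec::new();
    for &rows in batch_sizes {
        match remaining {
            // No limit: pass every batch through unchanged.
            None => emitted.push(rows),
            // Limit already reached: stop early, later batches are never read.
            Some(0) => break,
            // Emit at most `left` rows, truncating the final batch if needed.
            Some(left) => {
                let take = rows.min(left);
                emitted.push(take);
                remaining = Some(left - take);
            }
        }
    }
    emitted
}
```

With batches of 3, 3, and 3 rows and `fetch = Some(5)`, the sketch emits 3 rows and then 2 and never touches the third batch. With `fetch = Some(10)` it emits all 9 available rows.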



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

