2010YOUY01 commented on code in PR #19468:
URL: https://github.com/apache/datafusion/pull/19468#discussion_r2644617827
##########
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##########
@@ -550,17 +550,22 @@ impl ExecutionPlan for NestedLoopJoinExec {
}
fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics> {
- if partition.is_some() {
- return Ok(Statistics::new_unknown(&self.schema()));
- }
let join_columns = Vec::new();
- estimate_join_statistics(
- self.left.partition_statistics(None)?,
- self.right.partition_statistics(None)?,
+ let left_stats = self.left.partition_statistics(None)?;
+ let right_stats = match partition {
+ Some(partition) => self.right.partition_statistics(Some(partition))?,
+ None => self.right.partition_statistics(None)?,
+ };
+
+ let stats = estimate_join_statistics(
Review Comment:
I think we should add more comments to this function to explain how it
propagates statistics.
In particular, we set the join columns to empty here, but an NLJ might have
additional join conditions (for example, `t1 join t2 on (t1.v1+t2.v1)%2=0`),
and this utility is not able to handle them yet. Is this function assuming
that the input join doesn't have any join conditions?
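To make the concern concrete, here is a rough sketch of what an estimate without join columns can and cannot say. It does not use DataFusion's types or `estimate_join_statistics`: `RowEstimate` and `estimate_nlj_rows` are hypothetical names invented for illustration, and an inner join is assumed. With no join columns, the only available bound is the cross product, and once a join filter is present that number can only be treated as an inexact upper bound.

```rust
// Hypothetical, simplified stand-in for a row-count statistic.
// This is NOT DataFusion's `Statistics` / `Precision` API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RowEstimate {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Estimate the output row count of an inner nested loop join when no
/// equi-join columns are known (assumption: inner join only).
///
/// Without join columns, the cross product `left * right` is the only
/// bound we can derive. If an arbitrary join filter exists, such as
/// `(t1.v1 + t2.v1) % 2 = 0`, fewer rows may survive, so the result is
/// downgraded to an inexact upper bound.
fn estimate_nlj_rows(left: RowEstimate, right: RowEstimate, has_filter: bool) -> RowEstimate {
    use RowEstimate::*;
    let (l, r, inputs_exact) = match (left, right) {
        (Exact(l), Exact(r)) => (l, r, true),
        (Exact(l) | Inexact(l), Exact(r) | Inexact(r)) => (l, r, false),
        // If either side is unknown, nothing can be said about the output.
        _ => return Absent,
    };
    match l.checked_mul(r) {
        Some(n) if inputs_exact && !has_filter => Exact(n),
        Some(n) => Inexact(n),
        // Overflow: give up rather than report a wrong number.
        None => Absent,
    }
}
```

For example, with 4 rows on the left and 6 on the right, `estimate_nlj_rows(RowEstimate::Exact(4), RowEstimate::Exact(6), false)` yields `Exact(24)`, while passing `has_filter = true` yields `Inexact(24)`. Documenting which of these two cases the function assumes would answer the question above.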
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]