alamb commented on code in PR #21321:
URL: https://github.com/apache/datafusion/pull/21321#discussion_r3041281684
##########
datafusion/optimizer/src/propagate_empty_relation.rs:
##########
@@ -230,6 +265,60 @@ fn empty_child(plan: &LogicalPlan) -> Result<Option<LogicalPlan>> {
}
}
+/// Builds a Projection that replaces one side of an outer join with NULL literals.
+///
+/// When one side of an outer join is an `EmptyRelation`, the join can be eliminated
+/// by projecting the surviving side's columns as-is and replacing the empty side's
+/// columns with `CAST(NULL AS <type>)`.
+///
+/// The join schema is used as the projection's output schema to preserve nullability
+/// guarantees (important for FULL JOIN where the surviving side's columns are marked
+/// nullable in the join schema even if they aren't in the source schema).
+///
+/// # Example
+///
+/// For a `LEFT JOIN` where the right side is empty:
+/// ```text
+/// Left Join (orders.id = returns.order_id)        Projection(orders.id, orders.amount,
+///   ├── TableScan: orders                   =>                CAST(NULL AS Int64) AS order_id,
+///   └── EmptyRelation                                         CAST(NULL AS Utf8) AS reason)
+///                                                   └── TableScan: orders
+/// ```
+fn build_null_padded_projection(
+    surviving_plan: Arc<LogicalPlan>,
+    join_schema: &DFSchemaRef,
+    left_field_count: usize,
+    empty_side_is_right: bool,
+) -> Result<LogicalPlan> {
+    let exprs = join_schema
+        .iter()
+        .enumerate()
+        .map(|(i, (qualifier, field))| {
+            let on_empty_side = if empty_side_is_right {
+                i >= left_field_count
+            } else {
+                i < left_field_count
+            };
+
+            if on_empty_side {
+                Expr::Cast(Cast::new(
+                    Box::new(Expr::Literal(ScalarValue::Null, None)),
+                    field.data_type().clone(),
+                ))
Review Comment:
You can write this more concisely using the fluent API, something like
```rust
lit(ScalarValue::Null).cast(field.data_type().clone())
```
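For anyone following the index logic in the hunk above, here is a minimal, dependency-free sketch of how the join schema is split into a surviving side and a NULL-padded side. It does not use DataFusion types: `Field`, `EmptySide`, `field`, and `null_padded_exprs` are hypothetical stand-ins, expressions are rendered as plain strings, and the `Float64` type for `orders.amount` is an assumption made only for illustration.

```rust
/// Hypothetical stand-in for a join-schema field (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

/// Which side of the outer join is the empty relation.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmptySide {
    Left,
    Right,
}

/// Small constructor helper so the examples stay readable.
fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

/// Renders one projection expression per join-schema field.
/// Fields are assumed to be ordered left side first, then right side,
/// so `left_field_count` marks the boundary between the two sides.
fn null_padded_exprs(
    join_fields: &[Field],
    left_field_count: usize,
    empty_side: EmptySide,
) -> Vec<String> {
    join_fields
        .iter()
        .enumerate()
        .map(|(i, f)| {
            // Same test as in the hunk: the position decides the side.
            let on_empty_side = match empty_side {
                EmptySide::Right => i >= left_field_count,
                EmptySide::Left => i < left_field_count,
            };
            if on_empty_side {
                // Empty side: typed NULL, aliased to the original column name.
                format!("CAST(NULL AS {}) AS {}", f.data_type, f.name)
            } else {
                // Surviving side: pass the column through unchanged.
                f.name.clone()
            }
        })
        .collect()
}
```

Calling `null_padded_exprs` with the four fields from the doc-comment example, `left_field_count = 2`, and `EmptySide::Right` yields the two `orders` columns unchanged followed by the two typed NULLs, which matches the projection drawn in the doc comment.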
##########
datafusion/sqllogictest/test_files/subquery.slt:
##########
@@ -689,10 +689,8 @@ query TT
explain SELECT t1_id, (SELECT t2_id FROM t2 limit 0) FROM t1
----
logical_plan
-01)Projection: t1.t1_id, __scalar_sq_1.t2_id AS t2_id
-02)--Left Join:
-03)----TableScan: t1 projection=[t1_id]
-04)----EmptyRelation: rows=0
+01)Projection: t1.t1_id, Int32(NULL) AS t2_id
Review Comment:
👍 nice
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]