alamb commented on code in PR #21321:
URL: https://github.com/apache/datafusion/pull/21321#discussion_r3041281684


##########
datafusion/optimizer/src/propagate_empty_relation.rs:
##########
@@ -230,6 +265,60 @@ fn empty_child(plan: &LogicalPlan) -> 
Result<Option<LogicalPlan>> {
     }
 }
 
+/// Builds a Projection that replaces one side of an outer join with NULL 
literals.
+///
+/// When one side of an outer join is an `EmptyRelation`, the join can be 
eliminated
+/// by projecting the surviving side's columns as-is and replacing the empty 
side's
+/// columns with `CAST(NULL AS <type>)`.
+///
+/// The join schema is used as the projection's output schema to preserve 
nullability
+/// guarantees (important for FULL JOIN where the surviving side's columns are 
marked
+/// nullable in the join schema even if they aren't in the source schema).
+///
+/// # Example
+///
+/// For a `LEFT JOIN` where the right side is empty:
+/// ```text
+/// Left Join (orders.id = returns.order_id)        Projection(orders.id, 
orders.amount,
+///   ├── TableScan: orders                   =>      CAST(NULL AS Int64) AS 
order_id,
+///   └── EmptyRelation                               CAST(NULL AS Utf8) AS 
reason)
+///                                                   └── TableScan: orders
+/// ```
+fn build_null_padded_projection(
+    surviving_plan: Arc<LogicalPlan>,
+    join_schema: &DFSchemaRef,
+    left_field_count: usize,
+    empty_side_is_right: bool,
+) -> Result<LogicalPlan> {
+    let exprs = join_schema
+        .iter()
+        .enumerate()
+        .map(|(i, (qualifier, field))| {
+            let on_empty_side = if empty_side_is_right {
+                i >= left_field_count
+            } else {
+                i < left_field_count
+            };
+
+            if on_empty_side {
+                Expr::Cast(Cast::new(
+                    Box::new(Expr::Literal(ScalarValue::Null, None)),
+                    field.data_type().clone(),
+                ))

Review Comment:
   You can write this more concisely using the fluent API, something like
   
   ```rust
   lit(ScalarValue::Null).cast( field.data_type().clone())
   ```



##########
datafusion/sqllogictest/test_files/subquery.slt:
##########
@@ -689,10 +689,8 @@ query TT
 explain SELECT t1_id, (SELECT t2_id FROM t2 limit 0) FROM t1
 ----
 logical_plan
-01)Projection: t1.t1_id, __scalar_sq_1.t2_id AS t2_id
-02)--Left Join: 
-03)----TableScan: t1 projection=[t1_id]
-04)----EmptyRelation: rows=0
+01)Projection: t1.t1_id, Int32(NULL) AS t2_id

Review Comment:
   👍  nice



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to