Adam-Alani commented on code in PR #22853:
URL: https://github.com/apache/datafusion/pull/22853#discussion_r3472690704


##########
datafusion/expr/src/higher_order_function.rs:
##########
@@ -257,26 +266,64 @@ pub struct LambdaArgument {
 }
 
 impl LambdaArgument {
+    /// Build a [`LambdaArgument`] that treats every declared parameter as
+    /// used. This is the backwards-compatible behavior. Prefer
+    /// [`Self::new_with_used_params`] when the caller knows which subset of
+    /// the lambda's parameters the body actually references — otherwise the
+    /// merged batch will still contain columns for unused parameters.
     pub fn new(
         params: Vec<FieldRef>,
         body: Arc<dyn PhysicalExpr>,
         captures: Option<RecordBatch>,
     ) -> Self {
+        Self::new_with_used_params(params, body, captures, None)
+    }
+
+    /// Build a [`LambdaArgument`] knowing which subset of `params` (by name)
+    /// the lambda body actually references.
+    ///
+    /// When `used_params` is `Some(set)`, [`Self::evaluate`] only evaluates
+    /// and pushes the closures whose corresponding parameter name is in
+    /// `set`, in the original declaration order of `params`. Unused declared
+    /// parameters leave no slot in the merged batch, so the body's compressed
+    /// column indices line up directly. When `used_params` is `None`,
+    /// behavior is identical to [`Self::new`].
+    pub fn new_with_used_params(
+        params: Vec<FieldRef>,
+        body: Arc<dyn PhysicalExpr>,
+        captures: Option<RecordBatch>,
+        used_params: Option<HashSet<String>>,
+    ) -> Self {
+        let used_param_indices = used_params.map(|set| {
+            params
+                .iter()
+                .enumerate()
+                .filter(|(_, f)| set.contains(f.name()))
+                .map(|(i, _)| i)
+                .collect::<Vec<_>>()
+        });
+
+        let effective_params: Vec<FieldRef> = match &used_param_indices {
+            Some(indices) => indices.iter().map(|i| 
Arc::clone(&params[*i])).collect(),
+            None => params.clone(),
+        };
+
         let fields = match &captures {
             Some(batch) => batch
                 .schema_ref()
                 .fields()
                 .iter()
                 .cloned()
-                .chain(params.clone())
+                .chain(effective_params.iter().cloned())

Review Comment:
   Good catch — done in the latest force-push: dropped the `.iter().cloned()` 
and chain `effective_params` directly. The `None` branch already moved the 
`Vec`, so this just makes the `Some` branch consistent.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to