alamb commented on code in PR #19854:
URL: https://github.com/apache/datafusion/pull/19854#discussion_r2699967279


##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -139,47 +224,42 @@ impl FilterExec {
     }
 
     /// Return new instance of [FilterExec] with the given projection.
+    ///
+    /// # Deprecated
+    /// Use [`FilterExecBuilder::with_projection`] instead
+    #[deprecated(
+        since = "52.0.0",
+        note = "Use FilterExecBuilder::with_projection instead"
+    )]
     pub fn with_projection(&self, projection: Option<Vec<usize>>) -> Result<Self> {
-        //  Check if the projection is valid
+        // Check if the projection is valid against current output schema
         can_project(&self.schema(), projection.as_ref())?;

Review Comment:
   I wonder if the checks in `can_project` should be in the `FilterExecBuilder`?
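
To make the suggestion concrete, here is a minimal sketch of the kind of bounds check the builder could run in `build()`. `check_projection` is a hypothetical stand-in written for this example, not the real `can_project`; it takes a field count rather than a schema and uses `String` as the error type so the snippet stays self-contained.

```rust
/// Hypothetical stand-in for a projection validity check (not DataFusion's
/// `can_project`): every projected index must refer to an existing field.
fn check_projection(
    num_fields: usize,
    projection: Option<&[usize]>,
) -> Result<(), String> {
    match projection {
        // No projection means all columns are kept, so nothing to validate.
        None => Ok(()),
        Some(indices) => match indices.iter().find(|&&i| i >= num_fields) {
            // Report the first index that falls outside the schema.
            Some(bad) => Err(format!(
                "projection index {bad} out of bounds for {num_fields} fields"
            )),
            None => Ok(()),
        },
    }
}
```

Running a check like this once in `build()` would report an invalid projection at construction time, regardless of how many `with_projection` calls preceded it.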



##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -92,39 +92,124 @@ pub struct FilterExec {
     fetch: Option<usize>,
 }
 
+/// Builder for [`FilterExec`] to set optional parameters
+pub struct FilterExecBuilder {
+    predicate: Arc<dyn PhysicalExpr>,
+    input: Arc<dyn ExecutionPlan>,
+    projection: Option<Vec<usize>>,
+    default_selectivity: u8,
+    batch_size: usize,
+    fetch: Option<usize>,
+}
+
+impl FilterExecBuilder {
+    /// Create a new builder with required parameters (predicate and input)
+    pub fn new(predicate: Arc<dyn PhysicalExpr>, input: Arc<dyn ExecutionPlan>) -> Self {
+        Self {
+            predicate,
+            input,
+            projection: None,
+            default_selectivity: FILTER_EXEC_DEFAULT_SELECTIVITY,
+            batch_size: FILTER_EXEC_DEFAULT_BATCH_SIZE,
+            fetch: None,
+        }
+    }
+
+    /// Set the projection, composing with any existing projection.
+    ///
+    /// If a projection is already set, the new projection indices are mapped
+    /// through the existing projection. For example, if the current projection
+    /// is `[0, 2, 3]` and `with_projection(Some(vec![0, 2]))` is called, the
+    /// resulting projection will be `[0, 3]` (indices 0 and 2 of `[0, 2, 3]`).
+    ///
+    /// If no projection is currently set, the new projection is used directly.
+    /// If `None` is passed, the projection is cleared.
+    pub fn with_projection(mut self, projection: Option<Vec<usize>>) -> Self {

Review Comment:
   Given this follow-on projection behavior, I wonder if calling this 
`apply_projection` or `with_additional_projection` would make that clearer. I 
missed it the first time through, until I was looking at the tests.
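
For reference, the composition described in the doc comment can be sketched as follows. `compose_projection` is a hypothetical free function written for this example, not code from the PR; it assumes every index in the new projection is in range for the existing one and panics otherwise.

```rust
/// Hypothetical helper (not part of DataFusion): map `new` through `existing`.
fn compose_projection(
    existing: Option<&[usize]>,
    new: Option<Vec<usize>>,
) -> Option<Vec<usize>> {
    match (existing, new) {
        // Passing `None` clears the projection.
        (_, None) => None,
        // No projection set yet: the new one is used directly.
        (None, Some(new)) => Some(new),
        // Each new index selects an entry of the existing projection.
        // Assumption: indices are in range; out-of-range indices panic here.
        (Some(existing), Some(new)) => {
            Some(new.iter().map(|&i| existing[i]).collect())
        }
    }
}
```

With the example from the doc comment, an existing projection of `[0, 2, 3]` followed by `Some(vec![0, 2])` picks entries 0 and 2 of `[0, 2, 3]`, which gives `[0, 3]`.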



##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -92,39 +92,124 @@ pub struct FilterExec {
     fetch: Option<usize>,
 }
 
+/// Builder for [`FilterExec`] to set optional parameters
+pub struct FilterExecBuilder {
+    predicate: Arc<dyn PhysicalExpr>,
+    input: Arc<dyn ExecutionPlan>,
+    projection: Option<Vec<usize>>,
+    default_selectivity: u8,
+    batch_size: usize,

Review Comment:
   Since it would be so easy to overlook, I recommend either
   
   1. making `batch_size: Option<usize>` and then throwing an internal error if 
it is not set, or
   2. requiring `batch_size` in the constructor
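
A minimal sketch of option 1, using stand-in types rather than the real `FilterExec` and `FilterExecBuilder`. `SketchFilter`, `SketchFilterBuilder`, and the `String` error are assumptions made for this example only.

```rust
/// Stand-in for the built operator (hypothetical, not `FilterExec`).
#[derive(Debug)]
struct SketchFilter {
    batch_size: usize,
}

/// Stand-in builder where `batch_size` has no silent default.
struct SketchFilterBuilder {
    batch_size: Option<usize>,
}

impl SketchFilterBuilder {
    fn new() -> Self {
        Self { batch_size: None }
    }

    fn with_batch_size(mut self, batch_size: usize) -> Self {
        self.batch_size = Some(batch_size);
        self
    }

    /// Fails if `batch_size` was never set, so the omission cannot go unnoticed.
    fn build(self) -> Result<SketchFilter, String> {
        let batch_size = self
            .batch_size
            .ok_or_else(|| "internal error: batch_size was not set".to_string())?;
        Ok(SketchFilter { batch_size })
    }
}
```

Option 2 would instead move `batch_size` into `new(...)`, turning the same mistake into a compile-time error rather than a runtime one.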



##########
docs/source/library-user-guide/upgrading.md:
##########
@@ -118,6 +118,42 @@ let context = SimplifyContext::default()
 
 See [`SimplifyContext` documentation](https://docs.rs/datafusion-expr/latest/datafusion_expr/simplify/struct.SimplifyContext.html)
 for more details.
 
+### `FilterExec` builder methods deprecated
+
+The following methods on `FilterExec` have been deprecated in favor of using `FilterExecBuilder`:
+
+- `with_projection()`
+- `with_batch_size()`
+
+**Who is affected:**
+
+- Users who create `FilterExec` instances and use these methods to configure them
+
+**Migration guide:**
+
+Use `FilterExecBuilder` instead of chaining method calls on `FilterExec`:
+
+**Before:**
+
+```rust,ignore
+let filter = FilterExec::try_new(predicate, input)?
+    .with_projection(Some(vec![0, 2]))?
+    .with_batch_size(8192)?;
+```
+
+**After:**
+
+```rust,ignore
+let filter = FilterExecBuilder::new(predicate, input)
+    .with_projection(Some(vec![0, 2]))
+    .with_batch_size(8192)
+    .build()?;
+```
+
+The builder pattern is more efficient as it computes properties once during `build()` rather than recomputing them for each method call.
+
+Note: `with_default_selectivity()` is not deprecated as it simply updates a field value and does not require the overhead of the builder pattern.

Review Comment:
   I am not sure this is needed (it seems unnecessary), but it is also fine to 
leave it in.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]