nuno-faria commented on code in PR #19619:
URL: https://github.com/apache/datafusion/pull/19619#discussion_r2658870648
##########
docs/source/library-user-guide/functions/adding-udfs.md:
##########
@@ -1350,6 +1350,71 @@ async fn main() -> Result<()> {
[`create_udaf`]:
https://docs.rs/datafusion/latest/datafusion/logical_expr/fn.create_udaf.html
[`advanced_udaf.rs`]:
https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/udf/advanced_udaf.rs
+### Window Frame Compatible Accumulators
Review Comment:
This also doesn't fit in this PR.
##########
datafusion/functions-aggregate/src/percentile_cont.rs:
##########
@@ -533,14 +534,57 @@ impl<T: ArrowNumericType> Accumulator for
PercentileContAccumulator<T> {
}
fn evaluate(&mut self) -> Result<ScalarValue> {
- let d = std::mem::take(&mut self.all_values);
- let value = calculate_percentile::<T>(d, self.percentile);
+ let value = calculate_percentile::<T>(&mut self.all_values,
self.percentile);
ScalarValue::new_primitive::<T>(value, &self.data_type)
}
fn size(&self) -> usize {
size_of_val(self) + self.all_values.capacity() * size_of::<T::Native>()
}
+
+ fn retract_batch(&mut self, values: &[ArrayRef]) -> Result<()> {
Review Comment:
I'm not sure how this relates to this PR.
##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -8241,3 +8241,137 @@ NULL NULL NULL NULL
statement ok
drop table distinct_avg;
+
Review Comment:
Also not sure why new aggregate tests are required.
##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -126,6 +126,64 @@ impl FilterExec {
}
}
+ /// Create a FilterExec with projection applied during construction.
+ ///
+ /// This method is more efficient than calling
`try_new().with_projection()`
+ /// because it computes properties only once, avoiding redundant work.
+ ///
+ /// # Arguments
+ ///
+ /// * `predicate` - Boolean expression to filter rows
+ /// * `input` - Input execution plan
+ /// * `projection` - Optional column indices to project after filtering
+ ///
+ /// # Example
+ ///
+ /// ```ignore
+ /// // Create a filter that selects rows where column 0 > 5 and projects
columns [1, 3]
+ /// let filter = FilterExec::try_new_with_projection(
+ /// predicate,
+ /// input_plan,
+ /// Some(vec![1, 3])
+ /// )?;
+ /// ```
+ #[expect(clippy::needless_pass_by_value)]
+ pub fn try_new_with_projection(
Review Comment:
@adriangb suggested to add a builder instead of a new constructor and I tend
to agree. @GaneshPatil7517 could you create a `FilterExecBuilder`, with:
- a `new` function which would take the predicate and input, since they are
mandatory.
- several `with_...` methods for the projection, default_selectivity,
batch_size, and fetch.
- a `build` method which would return the `FilterExec`.
We then might even consider to have the original `try_new` use the builder
instead, to reduce the amount of duplicate code.
##########
docs/source/library-user-guide/filterexec-with-projection.md:
##########
@@ -0,0 +1,147 @@
+<!---
Review Comment:
I don't think a new user guide is needed just for this change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]