jayzhan211 commented on code in PR #14232: URL: https://github.com/apache/datafusion/pull/14232#discussion_r1929924628
##########
datafusion/functions-aggregate/src/first_last.rs:
##########
@@ -701,9 +713,98 @@ fn convert_to_sort_cols(arrs: &[ArrayRef], sort_exprs: &LexOrdering) -> Vec<Sort
 #[cfg(test)]
 mod tests {
     use arrow::array::Int64Array;
+    use arrow_schema::Schema;
+    use compute::SortOptions;
+    use datafusion_physical_expr::{expressions::col, PhysicalSortExpr};
     use super::*;
+    #[test]
+    fn test_last_value_with_order_bys() -> Result<()> {
+        // TODO: Move this kind of test to slt, we don't have a nice way to define the batch size for each `update_batch`

Review Comment:
   What I want is multiple batches that go through `update_batch` and `merge_batch`.

   ```
   statement count 0
   set datafusion.execution.batch_size = 2;

   statement count 0
   create table t(a int, b int) as values (1, 1), (2, 1), (null, 1), (3, 1), (1, 1), (2, 1), (null, 1), (3, 1);

   query I
   select last_value(a order by b) from t;
   ----
   1

   query TT
   explain select last_value(a order by b) from t;
   ----
   logical_plan
   01)Aggregate: groupBy=[[]], aggr=[[last_value(t.a) ORDER BY [t.b ASC NULLS LAST]]]
   02)--TableScan: t projection=[a, b]
   physical_plan
   01)AggregateExec: mode=Final, gby=[], aggr=[last_value(t.a) ORDER BY [t.b ASC NULLS LAST]]
   02)--CoalescePartitionsExec
   03)----AggregateExec: mode=Partial, gby=[], aggr=[last_value(t.a) ORDER BY [t.b ASC NULLS LAST]]
   04)------RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
   05)--------MemoryExec: partitions=1, partition_sizes=[1]
   ```

   Because the `MemoryExec` has a single partition (`partition_sizes=[1]`), all of the data ends up in a single batch. Even though the plan repartitions into 4 partitions, `update_batch` is only called once. I don't see a trivial way in slt to exercise multiple `update_batch` calls with different batches. `Insert into Table ...` behaves the same way.
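   To illustrate the shape of test I have in mind, here is a minimal, self-contained sketch. It does not use DataFusion's `Accumulator` trait: `ToyLastValue`, `run`, and the sample data are hypothetical stand-ins. It only shows the idea of splitting the input into several batches, feeding each one through `update_batch` on a partial accumulator, and combining the partial states through `merge_batch`.

```rust
/// Toy stand-in for an ordered `last_value` accumulator.
/// Hypothetical type for illustration only, not DataFusion's implementation.
#[derive(Debug, Default, Clone)]
struct ToyLastValue {
    /// (ordering key, value) of the best row seen so far.
    best: Option<(i64, Option<i64>)>,
    update_calls: usize,
    merge_calls: usize,
}

impl ToyLastValue {
    fn consider(&mut self, key: i64, value: Option<i64>) {
        // `>=` so that among equal keys the row seen later wins.
        if self.best.map_or(true, |(k, _)| key >= k) {
            self.best = Some((key, value));
        }
    }

    /// Consume one batch of raw rows (value column plus ordering column).
    fn update_batch(&mut self, values: &[Option<i64>], keys: &[i64]) {
        self.update_calls += 1;
        for (v, k) in values.iter().zip(keys) {
            self.consider(*k, *v);
        }
    }

    /// Intermediate state handed from the partial to the final stage.
    fn state(&self) -> Option<(i64, Option<i64>)> {
        self.best
    }

    /// Consume one batch of partial states.
    fn merge_batch(&mut self, states: &[Option<(i64, Option<i64>)>]) {
        self.merge_calls += 1;
        for (k, v) in states.iter().flatten() {
            self.consider(*k, *v);
        }
    }

    fn evaluate(&self) -> Option<i64> {
        self.best.and_then(|(_, v)| v)
    }
}

/// Split the rows into batches of `batch_size`, hand them round-robin to
/// `partitions` partial accumulators, then merge the partial states.
/// Returns (result, total `update_batch` calls, `merge_batch` calls).
fn run(
    values: &[Option<i64>],
    keys: &[i64],
    batch_size: usize,
    partitions: usize,
) -> (Option<i64>, usize, usize) {
    let mut partials = vec![ToyLastValue::default(); partitions];
    let batches = values.chunks(batch_size).zip(keys.chunks(batch_size));
    for (i, (v, k)) in batches.enumerate() {
        partials[i % partitions].update_batch(v, k);
    }
    let states: Vec<_> = partials.iter().map(|p| p.state()).collect();
    let mut final_acc = ToyLastValue::default();
    final_acc.merge_batch(&states);
    let updates: usize = partials.iter().map(|p| p.update_calls).sum();
    (final_acc.evaluate(), updates, final_acc.merge_calls)
}

fn main() {
    // Made-up data: the row with the largest ordering key sits in the first batch.
    let values = [Some(10), Some(20), None, Some(30), Some(40), Some(50), None, Some(60)];
    let keys = [3, 8, 1, 5, 2, 7, 4, 6];

    let (single, single_updates, _) = run(&values, &keys, 8, 1);
    let (multi, multi_updates, _) = run(&values, &keys, 2, 2);

    // The answer must not depend on how the rows were batched.
    assert_eq!(single, multi);
    println!("single: {single:?} after {single_updates} update_batch call(s)");
    println!("multi:  {multi:?} after {multi_updates} update_batch call(s)");
}
```

   The point of the second `run` call is that each partial accumulator sees more than one `update_batch` call before its state reaches `merge_batch`, which is the path the slt above never exercises.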
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org