alamb commented on code in PR #9991:
URL: https://github.com/apache/arrow-datafusion/pull/9991#discussion_r1559893156
##########
datafusion/sqllogictest/test_files/group_by.slt:
##########
@@ -2129,41 +2129,11 @@ query III
1 2 1550
1 3 2175
-
-# test_source_sorted_groupby2
-# If ordering is not important for the aggregation function, we should ignore the ordering requirement. Hence
-# "ORDER BY a DESC" should have no effect.
-query TT
-EXPLAIN SELECT a, d,
- SUM(c ORDER BY a DESC) as summation1
- FROM annotated_data_infinite2
- GROUP BY d, a
-----
-logical_plan
-Projection: annotated_data_infinite2.a, annotated_data_infinite2.d, SUM(annotated_data_infinite2.c) ORDER BY [annotated_data_infinite2.a DESC NULLS FIRST] AS summation1
---Aggregate: groupBy=[[annotated_data_infinite2.d, annotated_data_infinite2.a]], aggr=[[SUM(CAST(annotated_data_infinite2.c AS Int64)) ORDER BY [annotated_data_infinite2.a DESC NULLS FIRST]]]
-----TableScan: annotated_data_infinite2 projection=[a, c, d]
-physical_plan
-ProjectionExec: expr=[a@1 as a, d@0 as d, SUM(annotated_data_infinite2.c) ORDER BY [annotated_data_infinite2.a DESC NULLS FIRST]@2 as summation1]
---AggregateExec: mode=Single, gby=[d@2 as d, a@0 as a], aggr=[SUM(annotated_data_infinite2.c)], ordering_mode=PartiallySorted([1])
-----StreamingTableExec: partition_sizes=1, projection=[a, c, d], infinite_source=true, output_ordering=[a@0 ASC NULLS LAST]
-
-query III
+statement error DataFusion error: This feature is not implemented: ORDER BY is not implemented for SUM
Review Comment:
I hadn't thought about the implications of `SUM(ORDER BY ...)` for floats 🤔
It appears that this is consistent with what Postgres does:
```sql
postgres=# create table foo (x float);
CREATE TABLE
postgres=# insert into foo values (1.0);
INSERT 0 1
postgres=# insert into foo values (2.0);
INSERT 0 1
postgres=# insert into foo values (-1.0);
INSERT 0 1
postgres=# select sum(x ORDER BY x) from foo;
 sum
-----
   2
(1 row)
postgres=# select sum(x IGNORE NULLS) from foo;
ERROR: syntax error at or near "IGNORE"
LINE 1: select sum(x IGNORE NULLS) from foo;
                     ^
postgres=#
```
I also verified that Postgres actually does sort the input:
```sql
postgres=# explain select sum(x ORDER BY x) from foo;
                            QUERY PLAN
-------------------------------------------------------------------
 Aggregate  (cost=169.81..169.82 rows=1 width=8)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: x
         ->  Seq Scan on foo  (cost=0.00..32.60 rows=2260 width=8)
(4 rows)
postgres=#
```
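For context on why input order can matter for a float `SUM`: floating-point addition is not associative, so summing the same values in a different order can change the result. The following is a minimal sketch in Python, not DataFusion or Postgres code. The helper `naive_sum` and the sample values are hypothetical and are not the rows from the session above (those sum to 2 in any order).

```python
def naive_sum(values):
    """Accumulate left to right with plain float addition (no compensation)."""
    total = 0.0
    for v in values:
        total += v
    return total


# Hypothetical sample values, chosen because their sum is order-sensitive.
values = [0.1, 0.2, 0.3]

# Roughly analogous to SUM(x ORDER BY x ASC) and SUM(x ORDER BY x DESC).
ascending = naive_sum(sorted(values))
descending = naive_sum(sorted(values, reverse=True))

print(ascending, descending, ascending == descending)
```

The explicit loop is deliberate: the built-in `sum` in recent CPython versions may use compensated summation for floats, which would hide the effect. With integer inputs, as in the removed test's `SUM(CAST(c AS Int64))`, the order does not affect the result (overflow aside).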