ozankabak commented on code in PR #9991:
URL: https://github.com/apache/arrow-datafusion/pull/9991#discussion_r1559844325


##########
datafusion/sqllogictest/test_files/group_by.slt:
##########
@@ -2129,41 +2129,11 @@ query III
 1 2 1550
 1 3 2175
 
-
-# test_source_sorted_groupby2
-# If ordering is not important for the aggregation function, we should ignore 
the ordering requirement. Hence
-# "ORDER BY a DESC" should have no effect.
-query TT
-EXPLAIN SELECT a, d,
- SUM(c ORDER BY a DESC) as summation1
- FROM annotated_data_infinite2
- GROUP BY d, a
-----
-logical_plan
-Projection: annotated_data_infinite2.a, annotated_data_infinite2.d, 
SUM(annotated_data_infinite2.c) ORDER BY [annotated_data_infinite2.a DESC NULLS 
FIRST] AS summation1
---Aggregate: groupBy=[[annotated_data_infinite2.d, 
annotated_data_infinite2.a]], aggr=[[SUM(CAST(annotated_data_infinite2.c AS 
Int64)) ORDER BY [annotated_data_infinite2.a DESC NULLS FIRST]]]
-----TableScan: annotated_data_infinite2 projection=[a, c, d]
-physical_plan
-ProjectionExec: expr=[a@1 as a, d@0 as d, SUM(annotated_data_infinite2.c) 
ORDER BY [annotated_data_infinite2.a DESC NULLS FIRST]@2 as summation1]
---AggregateExec: mode=Single, gby=[d@2 as d, a@0 as a], 
aggr=[SUM(annotated_data_infinite2.c)], ordering_mode=PartiallySorted([1])
-----StreamingTableExec: partition_sizes=1, projection=[a, c, d], 
infinite_source=true, output_ordering=[a@0 ASC NULLS LAST]
-
-query III
+statement error DataFusion error: This feature is not implemented: ORDER BY is 
not implemented for SUM

Review Comment:
   A first principles viewpoint suggests this should depend on the data type. 
For integral types, `ORDER BY` wouldn't matter for `SUM`, but it shouldn't be 
an error to still specify it -- the optimizer should just remove it. For 
floating-point types, the summation order actually makes a difference in the 
result. Any data type for which addition doesn't commute, `ORDER BY` will have 
an impact on the result.
   
   However, I'm not sure if SQL standard says anything definitive in this 
matter. If not, it would be prudent to follow the results of this 
first-principles analysis.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to