ozankabak commented on code in PR #6482:
URL: https://github.com/apache/arrow-datafusion/pull/6482#discussion_r1214472005


##########
datafusion/core/tests/sqllogictests/test_files/groupby.slt:
##########
@@ -2384,3 +2384,197 @@ SELECT s.country, ARRAY_AGG(s.amount ORDER BY s.country 
DESC, s.amount DESC) AS
 FRA [200.0, 50.0] 250
 GRC [80.0, 30.0] 110
 TUR [100.0, 75.0] 175
+
+# test_reverse_aggregate_expr
+# Some of the Aggregators can be reversed, by this way we can still run 
aggregators
+# that have contradictory requirements at first glance.
+query TT
+EXPLAIN SELECT country, ARRAY_AGG(amount ORDER BY amount DESC) AS amounts,
+  FIRST_VALUE(amount ORDER BY amount ASC) AS fv1,
+  LAST_VALUE(amount ORDER BY amount DESC) AS fv2
+  FROM sales_global
+  GROUP BY country
+----
+logical_plan
+Projection: sales_global.country, ARRAY_AGG(sales_global.amount) ORDER BY 
[sales_global.amount DESC NULLS FIRST] AS amounts, 
FIRST_VALUE(sales_global.amount) ORDER BY [sales_global.amount ASC NULLS LAST] 
AS fv1, LAST_VALUE(sales_global.amount) ORDER BY [sales_global.amount DESC 
NULLS FIRST] AS fv2
+--Aggregate: groupBy=[[sales_global.country]], 
aggr=[[ARRAY_AGG(sales_global.amount) ORDER BY [sales_global.amount DESC NULLS 
FIRST], FIRST_VALUE(sales_global.amount) ORDER BY [sales_global.amount ASC 
NULLS LAST], LAST_VALUE(sales_global.amount) ORDER BY [sales_global.amount DESC 
NULLS FIRST]]]
+----TableScan: sales_global projection=[country, amount]
+physical_plan
+ProjectionExec: expr=[country@0 as country, ARRAY_AGG(sales_global.amount) 
ORDER BY [sales_global.amount DESC NULLS FIRST]@1 as amounts, 
FIRST_VALUE(sales_global.amount) ORDER BY [sales_global.amount ASC NULLS 
LAST]@2 as fv1, LAST_VALUE(sales_global.amount) ORDER BY [sales_global.amount 
DESC NULLS FIRST]@3 as fv2]
+--AggregateExec: mode=Single, gby=[country@0 as country], 
aggr=[ARRAY_AGG(sales_global.amount), FIRST_VALUE(sales_global.amount), 
LAST_VALUE(sales_global.amount)]

Review Comment:
   So we have two options:
   1. Simple string replacement of `FIRST_VALUE` to `LAST_VALUE` (and vice 
versa) and order direction when reversal happens.
   2. Generating name with the new mechanism @mustafasrepo experimented with.
   
   The advantage of (1) is that it preserves a uniform display format. Its 
disadvantage is that this is a specialized solution to 
`FIRST_VALUE`/`LAST_VALUE` and may not generalize to other functions in the 
future.
   
   The advantage of (2) is that is generalizes, but it has the drawback that we 
end up with an intermixed format; i.e. we have entries like 
`ARRAY_AGG(sales_global.amount)` along with entries like `LAST_VALUE(amount@1)`.
   
   Let's choose one for now and create an issue to fix this in a more general 
fashion in the future. I do not have a preference at this point -- @alamb, 
which one do you prefer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to