[GitHub] [arrow-datafusion] alex-natzka commented on a diff in pull request #3861: Fix 3635 redundant projections

GitBox Wed, 19 Oct 2022 01:53:04 -0700


alex-natzka commented on code in PR #3861:
URL: https://github.com/apache/arrow-datafusion/pull/3861#discussion_r999144048



##########
benchmarks/expected-plans/q19.txt:
##########
@@ -1,9 +1,8 @@
 Projection: SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue
   Aggregate: groupBy=[[]], aggr=[[SUM(CAST(lineitem.l_extendedprice AS Decimal128(38, 4)) * CAST(Decimal128(Some(100),23,2) - CAST(lineitem.l_discount AS Decimal128(23, 2)) AS Decimal128(38, 4))) AS SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)]]
-    Projection: lineitem.l_extendedprice, lineitem.l_discount

Review Comment:
   Hm, I don't think the hash aggregation takes into account columns that are
neither in the `groupBy` nor in the `aggr` expressions. Even if it did, it
probably shouldn't be the job of `CommonSubexprEliminate` to ensure that
redundant columns are projected out first; that should be its own (physical?)
optimization rule IMO.
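
   To illustrate the first point, here is a toy model in plain Rust. It is not
DataFusion's hash aggregate, and `Row`, `revenue`, and `project` are
hypothetical names made up for this sketch. Prices are integer cents and
discounts are whole percents, an assumption made only to keep the arithmetic
exact. The aggregate reads just the columns named in its expression, so
dropping the extra column beforehand should not change the result:

```rust
/// Hypothetical stand-in for a `lineitem` row. `l_comment` plays the role of
/// a column that appears in neither `groupBy` nor `aggr`.
#[allow(dead_code)]
struct Row {
    l_extendedprice: i64, // assumed unit: cents
    l_discount: i64,      // assumed unit: whole percent
    l_comment: &'static str,
}

/// Toy ungrouped aggregate modelled on the plan's
/// SUM(l_extendedprice * (1 - l_discount)), scaled by 100 because the
/// discount is a percent. It never touches `l_comment`.
fn revenue(rows: &[Row]) -> i64 {
    rows.iter()
        .map(|r| r.l_extendedprice * (100 - r.l_discount))
        .sum()
}

/// Toy projection: keeps the two referenced columns and blanks the rest,
/// mimicking `Projection: l_extendedprice, l_discount`.
fn project(rows: &[Row]) -> Vec<Row> {
    rows.iter()
        .map(|r| Row {
            l_extendedprice: r.l_extendedprice,
            l_discount: r.l_discount,
            l_comment: "",
        })
        .collect()
}
```

   In this model, `revenue(&rows)` and `revenue(&project(&rows))` agree for any
input, so the removed `Projection` only matters if the extra columns cost
something to carry around, which is a separate (physical?) concern.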



