Rachelint commented on issue #15633:
URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2866894705

   > Hi [@Rachelint](https://github.com/Rachelint), it seems you've been 
focusing on other topics over the past few weeks and haven’t had a chance to 
pick this up. If you don't mind, I’d like to take this on and push it to the 
finish line?
   
   Sure, I don't mind at all.
   
   I am really sorry about the long delay, which was due to some personal matters. I still need more time for #15591 and some related follow-up optimizations...
   
   -------------------------------------------
   
   Here are some results from experiments I ran for this issue recently; I hope they help.
   
   The transformation always seems to be better in the `no grouping` case, but introducing it in the `group by` case looks much more complex.
   
   Generally, there seem to be two situations:
   
   ## 1. No common subexpr exists after transforming
   In this situation, the query may get `slower` after transforming, for example:
   ```sql
   -- Origin (faster)
   SELECT aggr(a + b + c) FROM t GROUP BY d;
   
   -- After converting (slower)
   SELECT aggr(a) + aggr(b) + aggr(c) FROM t GROUP BY d;
   ```
   This is because:
   - in the `group by` case, we maintain a `Vec` for each `aggr`
   - the number of `Vec`s is 1 in the original query and 3 after converting
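
The cost argument above can be illustrated with a toy model. This is only a sketch in Python, not DataFusion code: `grouped_sums` is a hypothetical helper, `aggr` is assumed to be `SUM`, and each Python list stands in for the per-aggregate `Vec`, assumed here to hold one state value per group.

```python
def grouped_sums(rows, group_col, exprs):
    """Toy grouped SUM: keeps one per-group state list per aggregate expression."""
    group_ids = {}                    # group key -> dense group index
    states = [[] for _ in exprs]      # one state list (the "Vec") per aggregate
    for row in rows:
        gid = group_ids.setdefault(row[group_col], len(group_ids))
        for state, expr in zip(states, exprs):
            if gid == len(state):     # first row of a new group
                state.append(0)
            state[gid] += expr(row)
    return group_ids, states


rows = [
    {"a": 1, "b": 2, "c": 3, "d": "x"},
    {"a": 4, "b": 5, "c": 6, "d": "y"},
    {"a": 7, "b": 8, "c": 9, "d": "x"},
]

# Origin: a single aggregate, so a single state list is updated per row.
_, origin = grouped_sums(rows, "d", [lambda r: r["a"] + r["b"] + r["c"]])

# After converting: three aggregates, so three state lists are updated per row.
_, converted = grouped_sums(
    rows, "d", [lambda r: r["a"], lambda r: r["b"], lambda r: r["c"]]
)

# Adding the three per-group results reproduces the original aggregate.
recombined = [sum(parts) for parts in zip(*converted)]
```

Both forms give the same per-group result, but the converted form touches three state lists per input row instead of one, and nothing is shared between them.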
   
   ## 2. Common subexpr exists after transforming
   In this situation, the query can get `faster` after transforming, for example:
   ```sql
   -- Origin (slower)
   SELECT aggr(a + 1), aggr(a + 2), aggr(a + 3) FROM t GROUP BY d;
   
   -- After converting (faster)
   SELECT aggr(a) + aggr(1), aggr(a) + aggr(2), aggr(a) + aggr(3) FROM t GROUP BY d;
   ```
   This is because most of the `aggr` calls can be eliminated by the `common subexpr eliminate rule`, so we actually reduce computation after transforming.
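
A small sketch of why sharing helps, again as a toy Python model and not DataFusion code. It assumes `aggr` is `SUM`, that `a` has no NULLs, and that the constant term `aggr(k)` is computed as `k * COUNT(a)`; the function names are hypothetical.

```python
def sums_direct(rows, offsets):
    """Evaluate SUM(a + k) GROUP BY d with one accumulator per offset k."""
    acc = {}
    for d, a in rows:
        state = acc.setdefault(d, [0] * len(offsets))
        for i, k in enumerate(offsets):
            state[i] += a + k
    return acc


def sums_shared(rows, offsets):
    """Evaluate the same query from a shared SUM(a) and COUNT(a) per group."""
    acc = {}
    for d, a in rows:
        state = acc.setdefault(d, [0, 0])   # [SUM(a), COUNT(a)]
        state[0] += a
        state[1] += 1
    # SUM(a + k) == SUM(a) + k * COUNT(a), applied once per group at the end.
    return {d: [s + k * n for k in offsets] for d, (s, n) in acc.items()}


rows = [("x", 10), ("y", 20), ("x", 30)]
offsets = [1, 2, 3]
direct = sums_direct(rows, offsets)
shared = sums_shared(rows, offsets)
```

Under these assumptions the shared form updates two accumulators per row regardless of how many offsets appear in the query, while the direct form updates one per offset.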
   
   So I think that if we want to make this work generally, even in `group by` cases, we may need to combine it with the `common subexpr eliminate rule` to some extent.
   



