erratic-pattern commented on PR #10358:
URL: https://github.com/apache/datafusion/pull/10358#issuecomment-2095040396

   I've made a new algorithm for this that should in theory reduce the amount 
of work needed to be done by short-circuiting earlier once there is a 
consecutive sequence of unchanged expression trees equal to the number of 
rewriters. However, it ended up being harder than I thought to get actual 
performance improvements when comparing with local benchmarks. I tried several 
different approaches, but most were actually slower than the code that I have 
here in this PR.
   
   I did eventually come up with something that could possibly compete with the 
simpler algorithm in this PR. You can compare the branch 
[here](https://github.com/apache/datafusion/compare/main...erratic-pattern:adam/loop-expr-simplifier-static-dispatch?expand=1)
   
   My guess is that the additional overhead of checking each iteration is 
actual significant when we are only running 3 rewriters. I think we would see 
bigger improvements in cases where the number of optimization rules is larger. 
Many of the approaches I tried required dynamic dispatch on the 
`TreeNodeRewriter`s which was surprisingly a larger cost that I expected. The 
approach I ended up with avoids the dynamic dispatch which seems to be the main 
reason it's faster.
   
   I would be interested in scaling this up to run across all of the 
`OptimizationRules` where we should see a bigger improvement. However, I don't 
think everything has fully migrated from `try_optimize` so we would have to 
wait for that, I think. Once that happens, it should be possible to generalize 
the new branch code to work with both `OptimizationRule`s as well as 
`TreeNodeRewriter`s . 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to