avantgardnerio commented on issue #23194:
URL: https://github.com/apache/datafusion/issues/23194#issuecomment-4844950636

   Hi @gabotechs, I think your list of concerns is really helpful. I usually 
work with Claude on things, which provided this helpful mapping of each 
concern to what's in the PR:
   
   <img width="1792" height="400" alt="Image" src="https://github.com/user-attachments/assets/d2d65ead-7ee8-48eb-906b-b73cb0cfb593" />
   
   I think that makes the picture clearer than it has been. I haven't been at 
this long, but AQE seems to carry some baggage:
   
   1. "AQE is only for batch processes": I think your work is really 
innovative and shows this is untrue.
   2. "AQE is a distribution concern": I think Andy's work on 
DataFusion/Ballista (and this PR) shows that what is good for the distributed 
goose is also generally good for the local gander.
   
   I like what you said about idempotent rules being a keystone, and I see 
folks have been putting a lot of effort into making rules idempotent upstream 
so that downstream repos benefit. This puts DataFusion in the interesting 
position of having to maintain the code (and the invariant) without being able 
to test it easily or reap any direct performance benefit. (Yes, there can be 
unit tests, but contributors must know to write and maintain them, and why.)
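
   To make the idempotency invariant concrete, here is a minimal sketch of the 
kind of check such a unit test would perform. Everything in it is a toy 
assumption for illustration: `Plan`, `Rule`, `MergeFilters`, `AlwaysWrap`, and 
`is_idempotent_on` are hypothetical names, not DataFusion types or anything 
from the PR.

```rust
// Toy stand-ins for illustration only; these are NOT DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: String, input: Box<Plan> },
}

/// Hypothetical rule interface: rewrite a plan into an equivalent plan.
trait Rule {
    fn apply(&self, plan: Plan) -> Plan;
}

/// Collapses directly nested filters into a single conjunctive filter.
struct MergeFilters;

impl Rule for MergeFilters {
    fn apply(&self, plan: Plan) -> Plan {
        match plan {
            Plan::Filter { predicate, input } => match self.apply(*input) {
                // The (already rewritten) child is a filter: merge the two.
                Plan::Filter { predicate: inner, input: inner_input } => Plan::Filter {
                    predicate: format!("({inner}) AND ({predicate})"),
                    input: inner_input,
                },
                // Otherwise keep this filter on top of the rewritten child.
                other => Plan::Filter { predicate, input: Box::new(other) },
            },
            scan => scan,
        }
    }
}

/// Counter-example: adds a new filter on every application, so it can never
/// reach a fixed point and is unsafe to re-run.
struct AlwaysWrap;

impl Rule for AlwaysWrap {
    fn apply(&self, plan: Plan) -> Plan {
        Plan::Filter { predicate: "true".to_string(), input: Box::new(plan) }
    }
}

/// A rule is idempotent on `plan` if applying it a second time changes nothing.
fn is_idempotent_on(rule: &dyn Rule, plan: &Plan) -> bool {
    let once = rule.apply(plan.clone());
    let twice = rule.apply(once.clone());
    once == twice
}
```

   A rule like `MergeFilters` passes this check on nested filters, while 
`AlwaysWrap` fails it. The point of the sketch is only that the check is cheap 
to write but easy to forget, since nothing in a single optimizer pass 
exercises it.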
   
   What this PR hopes to offer the downstream community is:
   
   1. An operational whitelist of AQE-ready optimizer rules, built up 
gradually over time (idempotent ones might be ready to go!)
   2. Operators that are AQE-compatible
   3. _And most importantly, in-repo benefit_ (performance gains), so 
DataFusion contributors have an incentive to test and maintain them
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

