ozankabak commented on PR #5171:
URL: 
https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1420248052

   > @mustafasrepo @ozankabak Regarding the order in which rules are applied: 
since DataFusion's optimization framework is still a traditional 
heuristic-style framework, the order of rule application always matters; we 
cannot assume one rule can work independently of the others.
   > 
   > Specifically, the `EnforceDistribution` rule is responsible for handling 
the global distribution requirements, and the `EnforceSorting` rule is 
responsible for handling the local sort requirements. It's also responsible for 
removing unnecessary global and local sorts. The global distribution 
requirements need to be handled first; after that we can handle the local 
(within-partition) sort requirements.
   > 
   > Global properties vs Local properties 
http://www.cs.albany.edu/~jhh/courses/readings/zhou10.pdf
   
   I agree that fixing partitioning (global) and then sorting (local) is the 
more intuitive order, but this does not seem strictly necessary to me in 
theory. I can imagine changing global properties while still preserving the 
previous local properties for every partition (in the new plan). I think such 
behavior would make rules very robust and easy to reason about. The current PR 
is not really about this anyway, but that's my general line of thinking when we 
refer to orthogonality.
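
   As a toy illustration of that idea (plain Python, not DataFusion code; the 
function name and the list-of-lists partition model are assumptions made only 
for this sketch), a hash repartition can change the global property while 
every output partition stays sorted, provided each input partition was sorted 
to begin with:

```python
import heapq


def repartition_preserving_order(partitions, num_out, key=lambda row: row):
    """Hash-repartition rows while keeping each output partition sorted.

    Assumes every input partition is already sorted by `key`.
    """
    # For each output partition, collect one sorted chunk per input partition.
    chunks_per_output = [[] for _ in range(num_out)]
    for part in partitions:
        # Filtering a sorted list keeps the surviving rows in sorted order,
        # so each chunk produced here is itself sorted.
        split = [[] for _ in range(num_out)]
        for row in part:
            split[hash(key(row)) % num_out].append(row)
        for i, chunk in enumerate(split):
            chunks_per_output[i].append(chunk)
    # A k-way merge of sorted chunks yields a sorted output partition.
    return [list(heapq.merge(*chunks, key=key)) for chunks in chunks_per_output]
```

   The k-way merge is what keeps the local ordering intact; a plain 
concatenation of the chunks would not, and a sort would have to be 
re-established afterwards.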
   
   Nevertheless, maybe you are aware of a fundamental issue (that I am not 
foreseeing right now) which makes this impossible. If that is the case, then we 
will stay with the status quo, of course.

