mustafasrepo opened a new pull request, #4839:
URL: https://github.com/apache/arrow-datafusion/pull/4839

   # Which issue does this PR close?
   
   N/A
   
   # Rationale for this change
   
   During the review of 
[#4691](https://github.com/apache/arrow-datafusion/pull/4691), one of the key 
findings was a separation-of-concerns issue: both `BasicEnforcement` and 
`OptimizeSorts` dealt with local sorting, so their functionality overlapped.
   
   This PR pays down that technical debt: the enforcers of distribution and 
sorting requirements are now two orthogonal rules (`EnforceDistribution` and 
`EnforceSorting`). Applying these two rules in succession yields the same 
result as the old `BasicEnforcement` rule.
   
   The new `EnforceSorting` rule does not just enforce sorts by naively adding 
`SortExec`s; it adds or removes them as needed while enforcing the ordering 
requirements. This should help with rule reuse and make plans easier to reason 
about, since applying `EnforceSorting` liberally does not cost optimality.
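
To illustrate the add-or-remove behavior described above, here is a minimal sketch on a toy plan type. It is not DataFusion code: `Plan`, its variants, `output_ordering`, and the free function `enforce_sorting` are all hypothetical names invented for this example, and orderings are reduced to a single column name.

```rust
// Toy model only; none of these types or functions exist in DataFusion.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf that produces rows, optionally already ordered by a column.
    Scan { ordered_by: Option<String> },
    /// Sorts its input by the column `by` (stand-in for a sort operator).
    Sort { by: String, input: Box<Plan> },
    /// Operator that requires its input ordered by `requires`
    /// and preserves that ordering in its output.
    NeedsOrder { requires: String, input: Box<Plan> },
}

impl Plan {
    /// Column this node's output is ordered by, if any.
    fn output_ordering(&self) -> Option<&str> {
        match self {
            Plan::Scan { ordered_by } => ordered_by.as_deref(),
            Plan::Sort { by, .. } => Some(by.as_str()),
            Plan::NeedsOrder { requires, .. } => Some(requires.as_str()),
        }
    }
}

/// Bottom-up pass that removes sorts whose input is already ordered
/// and adds sorts where a required ordering is not satisfied.
fn enforce_sorting(plan: Plan) -> Plan {
    match plan {
        scan @ Plan::Scan { .. } => scan,
        Plan::Sort { by, input } => {
            let input = enforce_sorting(*input);
            if input.output_ordering() == Some(by.as_str()) {
                // Input is already ordered as requested: drop the sort.
                input
            } else {
                Plan::Sort { by, input: Box::new(input) }
            }
        }
        Plan::NeedsOrder { requires, input } => {
            let input = enforce_sorting(*input);
            let input = if input.output_ordering() == Some(requires.as_str()) {
                input
            } else {
                // Requirement not met: add a sort below this operator.
                Plan::Sort { by: requires.clone(), input: Box::new(input) }
            };
            Plan::NeedsOrder { requires, input: Box::new(input) }
        }
    }
}
```

In this toy model, running `enforce_sorting` a second time on its own output returns the same plan, which is the property that makes liberal use of such a rule harmless.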
   
   # What changes are included in this PR?
   
   Local sort enforcement and optimization are now handled by a single rule. 
The new rule can be applied multiple times without degrading the final 
physical plan.
   
   # Are these changes tested?
   
   Existing tests check for plan correctness.
   
   # Are there any user-facing changes?
   
   No.
   
   # Future Work
   
   Some of the `EnforceDistribution` tests actually test the full 
`EnforceDistribution` + `EnforceSorting` cascade. It would be a good idea to 
orthogonalize those tests too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

