mustafasrepo opened a new pull request, #4839: URL: https://github.com/apache/arrow-datafusion/pull/4839
# Which issue does this PR close?

N/A

# Rationale for this change

During the review of [#4691](https://github.com/apache/arrow-datafusion/pull/4691), one of the key findings was a small separation-of-concerns issue: both `BasicEnforcement` and `OptimizeSorts` were dealing with local sorting, and there was some overlap in functionality.

This PR pays this technical debt: enforcers of distribution and sorting requirements will henceforth be two completely orthogonal rules (`EnforceDistribution` and `EnforceSorting`). Note that one can get the same result as the old `BasicEnforcement` rule by applying these two rules in succession.

The new `EnforceSorting` doesn't just enforce sorts by naively adding `SortExec`s; it will smartly add or remove them as it enforces the ordering requirements. This will hopefully help with rule reuse and ease reasoning (we will not lose optimality by liberally using `EnforceSorting`).

# What changes are included in this PR?

Local sort enforcement and optimization are now handled by a single rule. The current rule can be called multiple times without a downside in terms of the final physical plan.

# Are these changes tested?

Existing tests check for plan correctness.

# Are there any user-facing changes?

No.

# Future Work

Some of the `EnforceDistribution` tests actually test the full `EnforceDistribution` + `EnforceSorting` cascade. It would be a good idea to orthogonalize those tests too.
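
To illustrate the "add or remove" behavior and the claim that the rule can be applied repeatedly without changing the final plan, here is a minimal toy sketch. It does not use DataFusion's actual types or the real `EnforceSorting` implementation: the `Plan` enum, the helper constructors, and the `enforce_sorting` function below are all hypothetical names made up for this example, and ordering is simplified to a single column name.

```rust
// Toy plan model. These types are illustrative assumptions, NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf node; `ordered_by` is the column its output is already sorted on, if any.
    Scan { ordered_by: Option<String> },
    /// Explicit local sort on `key`.
    Sort { key: String, input: Box<Plan> },
    /// Operator that requires its input ordered by `key` and preserves that ordering.
    NeedsOrder { key: String, input: Box<Plan> },
}

// Small hypothetical constructors to keep the examples readable.
fn scan(ordered_by: Option<&str>) -> Plan {
    Plan::Scan { ordered_by: ordered_by.map(String::from) }
}
fn sort(key: &str, input: Plan) -> Plan {
    Plan::Sort { key: key.to_string(), input: Box::new(input) }
}
fn needs_order(key: &str, input: Plan) -> Plan {
    Plan::NeedsOrder { key: key.to_string(), input: Box::new(input) }
}

/// Ordering that a plan's output is known to have in this toy model.
fn output_ordering(plan: &Plan) -> Option<&str> {
    match plan {
        Plan::Scan { ordered_by } => ordered_by.as_deref(),
        Plan::Sort { key, .. } => Some(key.as_str()),
        Plan::NeedsOrder { key, .. } => Some(key.as_str()),
    }
}

/// Bottom-up pass that drops redundant sorts and adds missing ones.
fn enforce_sorting(plan: Plan) -> Plan {
    match plan {
        scan @ Plan::Scan { .. } => scan,
        Plan::Sort { key, input } => {
            let input = enforce_sorting(*input);
            if output_ordering(&input) == Some(key.as_str()) {
                // The input is already ordered on `key`: remove the sort.
                input
            } else {
                Plan::Sort { key, input: Box::new(input) }
            }
        }
        Plan::NeedsOrder { key, input } => {
            let input = enforce_sorting(*input);
            let input = if output_ordering(&input) == Some(key.as_str()) {
                input
            } else {
                // The requirement is not satisfied: add a sort.
                Plan::Sort { key: key.clone(), input: Box::new(input) }
            };
            Plan::NeedsOrder { key, input: Box::new(input) }
        }
    }
}

fn main() {
    // A sort over an already-ordered scan is removed.
    let removed = enforce_sorting(needs_order("a", sort("a", scan(Some("a")))));
    println!("{:?}", removed);

    // A missing sort is added, and a second application changes nothing.
    let added = enforce_sorting(needs_order("a", scan(None)));
    println!("{:?}", added);
    println!("idempotent: {}", enforce_sorting(added.clone()) == added);
}
```

In this simplified model a second pass is a no-op because, after the first pass, every ordering requirement is already met and no sort sits directly on top of an input that already has the same ordering.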
