mustafasrepo opened a new pull request, #5661: URL: https://github.com/apache/arrow-datafusion/pull/5661
# Which issue does this PR close?

N/A

# Rationale for this change

As some of you may know, we have been (and still are) collaborating with @mingmwang on a top-down refactoring of the `EnforceSorting` [rule](https://github.com/apache/arrow-datafusion/pull/5290). During our quest to find the simplest implementation that still offers full functionality, we observed that the bottom-up and top-down approaches each have their pros and cons. A bottom-up approach is better at avoiding pipeline breakages, while a top-down approach results in simpler code and can perform certain pushdown operations that its bottom-up sibling cannot. It is still not entirely clear whether we can achieve full functionality with a pure top-down approach, but in the interim, we can combine the two approaches to offer full functionality AND implement a test suite to guide our future efforts.

# What changes are included in this PR?

This PR implements a hybrid top-down/bottom-up sort optimization rule by utilizing the existing bottom-up approach and the top-down approach implemented by @mingmwang. With this hybrid rule, all test plans either improve or stay the same. We are also checking in a much larger test suite that verifies a range of improvements, such as union pushdowns. We think the new test suite will serve as a very useful baseline for future improvements to the `EnforceSorting` rule (including our ongoing collaboration on finding out whether a pure top-down approach is possible).

# Are these changes tested?

New unit tests are added. Existing tests also pass.

# Are there any user-facing changes?

No.
