touchida opened a new issue, #5057: URL: https://github.com/apache/kyuubi/issues/5057
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

### Search before asking

- [X] I have searched in the [issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no similar issues.

### What would you like to be improved?

Currently, the `insertRepartitionBeforeWrite` optimization rule in the Spark extension is skipped whenever the logical plan is a `Sort`, regardless of whether the sort is global or local: https://github.com/apache/kyuubi/blob/fa9e6be/extensions/spark/kyuubi-extension-spark-common/src/main/scala/org/apache/kyuubi/sql/RepartitionBeforeWritingBase.scala#L133.

Skipping makes sense for a global sort: inserting a repartition after the sort changes the semantics of the original plan, and inserting one before it only introduces an additional shuffle.

For a local sort, however, inserting a repartition before the sort still lets the rebalanced partitions be sorted, if only locally. It also matches the behavior of queries that explicitly combine `REPARTITION|REBALANCE` with `SORT BY`.

### How should we improve?

This issue proposes supporting local sort in the `insertRepartitionBeforeWrite` optimization rule by inserting the repartition before the sort.

### Are you willing to submit PR?

- [X] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.
- [ ] No. I cannot submit a PR at this time.
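The proposed handling of local versus global sorts can be sketched with a toy plan model. This is not Kyuubi's or Spark's code: the `Scan`, `Rebalance`, and `Sort` classes and the `insert_rebalance_before_write` function are hypothetical stand-ins, and the real rule checks further conditions that are left out here.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical, simplified stand-ins for logical plan nodes.
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Rebalance:
    child: "Plan"


@dataclass(frozen=True)
class Sort:
    is_global: bool
    child: "Plan"


Plan = Union[Scan, Rebalance, Sort]


def insert_rebalance_before_write(query: Plan) -> Plan:
    """Toy version of the proposed behavior for the query feeding a write."""
    if isinstance(query, Rebalance):
        # Already rebalanced: nothing to add.
        return query
    if isinstance(query, Sort):
        if query.is_global:
            # Global sort: skip the rule. A rebalance above the sort would
            # break the ordering, and one below it only adds a shuffle.
            return query
        if isinstance(query.child, Rebalance):
            # The user already asked for a rebalance under the local sort.
            return query
        # Local sort: rebalance the input first, then sort each partition.
        return Sort(False, Rebalance(query.child))
    # Any other plan in this toy model: rebalance right before the write.
    return Rebalance(query)


if __name__ == "__main__":
    print(insert_rebalance_before_write(Sort(False, Scan("t"))))
    print(insert_rebalance_before_write(Sort(True, Scan("t"))))
```

In this sketch, a local sort over `Scan("t")` becomes a local sort over a rebalanced scan, which has the same shape as a query that explicitly combines a rebalance with `SORT BY`, while a global sort is returned unchanged.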
