mingmwang commented on PR #5171: URL: https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1420632377
> > Yes, please take a look at the `GlobalSortSelection` rule. This rule does not need to care about the position of `CoalescePartitionsExec` because `CoalescePartitionsExec`s are added by `EnforceDistribution` rule which is triggered after the `GlobalSortSelection` rule. The physical Sort Selection should happen in a very early stage of the physical optimization phase. I guess why the current `GlobalSortSelection` does not optimize all the `Global Sort` is because it is not that aggressive and has an additional check. If you comment that check, all the `Global Sort` should be replaced. > > `&& sort_exec.fetch().is_some()` > > @mingmwang your suggestion works. This greatly simplifies the code. Thanks for the suggestion. By the way I didn't remove `&& sort_exec.fetch().is_some()` check directly. I `OR`ed this check with the config option (can be found [here](https://github.com/synnada-ai/arrow-datafusion/blob/6db42485b016ef0acb183cb70e391bfc6910d4f9/datafusion/core/src/physical_optimizer/global_sort_selection.rs#L59)). In case, one wants to toggle this feature. As you, and @alamb say in some contexts this may not be what users want. @mustafasrepo My suggestion is just removing this fetch check and use the config option you added in this PR to turn on/off the optimization. @alamb How do you think ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
