ozankabak commented on PR #5171: URL: https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1421168600
> I think this is because that global sort + CoalescePartitionsExec were added later by the two enforcement rules. An easy way to get rid of this is to run the GlobalSortSelection rule again after the two enforcement rules. I would still prefer to let the GlobalSortSelection rule handle this optimization. We need to enhance the GlobalSortSelection rule to handle the SortExec + CoalescePartitionsExec combination.

If we end up handling this combination there and running the rule twice, it really diminishes the value of this approach. Maybe there is a way to do it elegantly; I will think about it in detail. If we (or you) can figure out an elegant way to do this, we can go back to this approach, but for now it doesn't look too good to me.

> Another approach I can think of is to have specific handling in the EnforceDistribution rule: if the plan's distribution requirement is Distribution::SinglePartition, the plan also has some sorting requirements, and the prefer-parallel-sort configuration is on, add SortPreservingMergeExec + SortExec. If the SortExec is unnecessary, it will be removed later by the EnforceSorting rule.

I think this is interesting and sounds more promising to me. I will think about this today; maybe we can do this in a follow-on PR.
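
To make the second suggestion concrete, here is a rough sketch of the decision it describes. It uses toy stand-in types, not DataFusion's actual `ExecutionPlan` API: the `Plan` enum, the `Distribution` enum as written here, and the `enforce_single_partition` function with its `requires_sort` and `prefer_parallel_sort` parameters are all hypothetical names chosen for illustration. It also assumes that a later pass (EnforceSorting in the discussion) would drop the inner sort when it turns out to be redundant; that pass is not modeled.

```rust
/// Toy stand-ins for physical plan nodes. These are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A leaf that produces `partitions` output partitions.
    Scan { partitions: usize },
    /// Sorts each input partition independently (partition count unchanged).
    Sort { input: Box<Plan> },
    /// Merges already-sorted partitions into a single sorted partition.
    SortPreservingMerge { input: Box<Plan> },
    /// Concatenates all partitions into one without preserving any order.
    CoalescePartitions { input: Box<Plan> },
}

/// Simplified distribution requirement (hypothetical, two variants only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Distribution {
    Unspecified,
    SinglePartition,
}

impl Plan {
    /// Number of output partitions of this toy plan.
    fn partitions(&self) -> usize {
        match self {
            Plan::Scan { partitions } => *partitions,
            Plan::Sort { input } => input.partitions(),
            Plan::SortPreservingMerge { .. } | Plan::CoalescePartitions { .. } => 1,
        }
    }
}

/// Sketch of the proposed special case inside a distribution-enforcing rule.
/// When a single partition is required, the parent also needs sorted input,
/// and parallel sort is preferred, emit SortPreservingMerge over a
/// per-partition Sort instead of a plain CoalescePartitions.
fn enforce_single_partition(
    child: Plan,
    required: Distribution,
    requires_sort: bool,
    prefer_parallel_sort: bool,
) -> Plan {
    // Nothing to do if no single-partition requirement or already satisfied.
    if required != Distribution::SinglePartition || child.partitions() <= 1 {
        return child;
    }
    if requires_sort && prefer_parallel_sort {
        // Sort every partition in parallel, then merge while keeping order.
        Plan::SortPreservingMerge {
            input: Box::new(Plan::Sort { input: Box::new(child) }),
        }
    } else {
        // Fallback: plain coalesce; any required global sort would be added
        // by a later rule, which is outside this sketch.
        Plan::CoalescePartitions { input: Box::new(child) }
    }
}
```

In this toy model, a four-partition scan under a sorted, single-partition requirement becomes SortPreservingMerge over Sort when the flag is on, and CoalescePartitions when it is off, so the decision lives in one place rather than being patched up by rerunning a separate rule.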