ozankabak commented on PR #5171:
URL: https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1421168600

   > I think this is because the global sort + CoalescePartitionsExec were 
added later by the two enforcement rules.
   An easy way to get rid of this is to run the GlobalSortSelection rule 
again after the two enforcement rules. I would still prefer to let the 
GlobalSortSelection rule handle this optimization. We need to enhance the 
GlobalSortSelection rule to handle the SortExec + CoalescePartitionsExec 
combination.
   
   If we end up handling this combination there and running the rule twice, 
it really diminishes the value of this approach. Maybe there is a way to do it 
elegantly; I will think about it in detail. If we (or you) can figure out a way 
to do this elegantly, we can go back to this approach, but for now it doesn't 
look too good to me.
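
   To make the combination concrete, here is a toy sketch of the rewrite the 
rule would have to perform. The `Plan` enum and the `parallelize_sort` 
function are hypothetical stand-ins made up for illustration; they are not 
DataFusion's actual `ExecutionPlan` API or the real GlobalSortSelection code.

```rust
// Toy model only: hypothetical plan nodes, not DataFusion's real operators.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    // A multi-partition source.
    Scan,
    // Sorts its input.
    Sort(Box<Plan>),
    // Concatenates all partitions into one, losing any ordering.
    CoalescePartitions(Box<Plan>),
    // Merges already-sorted partitions into one sorted partition.
    SortPreservingMerge(Box<Plan>),
}

/// Rewrites a sort sitting directly on top of a coalesce into
/// per-partition sorts followed by a sort-preserving merge.
/// Only the root of the plan is inspected; other shapes are returned unchanged.
fn parallelize_sort(plan: Plan) -> Plan {
    match plan {
        Plan::Sort(input) => match *input {
            Plan::CoalescePartitions(inner) => {
                Plan::SortPreservingMerge(Box::new(Plan::Sort(inner)))
            }
            other => Plan::Sort(Box::new(other)),
        },
        other => other,
    }
}
```

   The point of the sketch is only that the pattern match is cheap to write; 
the cost is in having to run the rule a second time after the enforcement 
rules have inserted these nodes.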
   
   > Another approach I can think of is to add specific handling in the 
EnforceDistribution rule: if the plan's distribution requirement is 
Distribution::SinglePartition, the plan also has some sorting requirements, 
and the prefer-parallel-sort configuration is on, add SortPreservingMergeExec + 
SortExec. If the SortExec is unnecessary, it will be removed later by the 
EnforceSorting rule.
   
   I think this is interesting and sounds more promising to me. I will think 
about this today; maybe we can do this in a follow-on PR.
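
   As a rough illustration of the decision described in the quote, here is a 
minimal sketch. The `Plan` enum, the `enforce_single_partition` helper, and 
its boolean parameters are all hypothetical names invented for this example; 
the real EnforceDistribution rule works on DataFusion's `ExecutionPlan` trees 
and is not shown here.

```rust
// Toy model only: hypothetical plan nodes, not DataFusion's real operators.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    // A multi-partition source.
    Scan,
    // Sorts each input partition independently.
    Sort(Box<Plan>),
    // Merges already-sorted partitions into one sorted partition.
    SortPreservingMerge(Box<Plan>),
    // Concatenates all partitions into one, losing any ordering.
    CoalescePartitions(Box<Plan>),
}

/// Hypothetical helper: satisfy a single-partition requirement on `child`.
/// `needs_sort` stands for "the parent also has a sorting requirement" and
/// `prefer_parallel_sort` stands for the configuration flag from the quote.
fn enforce_single_partition(child: Plan, needs_sort: bool, prefer_parallel_sort: bool) -> Plan {
    if needs_sort && prefer_parallel_sort {
        // Sort every partition in parallel, then merge while keeping the order.
        // A later pass may drop the Sort if the input is already ordered.
        Plan::SortPreservingMerge(Box::new(Plan::Sort(Box::new(child))))
    } else {
        // Default behaviour: just collapse the partitions.
        Plan::CoalescePartitions(Box::new(child))
    }
}
```

   With this shape the sort-aware choice is made at the point where the 
single-partition requirement is enforced, so no second pass over the plan is 
needed to find the SortExec + CoalescePartitionsExec pair.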


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
