[GitHub] [arrow-datafusion] Dandandan commented on issue #438: Removing the physical optimizer `AddMergeExec` breaks the sorts

GitBox Sat, 29 May 2021 01:30:55 -0700


Dandandan commented on issue #438:
URL: 
https://github.com/apache/arrow-datafusion/issues/438#issuecomment-850795291



   Yes, agreed 👍 I think hash aggregates might have the same problem.
   
   Back when I added the physical optimizers I added this rule to keep it 
functioning the same way, as this was contained in the optimization pass.
   I believe there were some comments back then too that we should change it at 
some time.
   
   I think we have a couple of options:
   
   * Add a `MergeExec` during planning time, wrap any input plan for a node 
that requires a single partition with `MergeExec` (to me it seems like the best 
option).
   * Use `MergeExec` inside the `SortExec` - this is something done in the 
`HashJoinExec` and `CrossJoinExec` too. I am not sure if that's really elegant 
as it "hides" nodes from the plan tree - so maybe that should move to the 
planner too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow-datafusion] Dandandan commented on issue #438: Removing the physical optimizer `AddMergeExec` breaks the sorts

Reply via email to