[GitHub] [arrow-datafusion] alamb commented on pull request #1776: Update `ExecutionPlan` to know about sortedness and repartitioning optimizer pass respect the invariants

GitBox Tue, 08 Feb 2022 12:40:13 -0800


alamb commented on pull request #1776:
URL: 
https://github.com/apache/arrow-datafusion/pull/1776#issuecomment-1033040953



   Something else I have been musing about is how to handle the case where we 
only learn that the data is sorted after a partition has been executed.
   
   For example, suppose that in some future world `GroupByHash` produces its 
output in sorted group key order whenever it spills to disk. If that output is 
then fed into a `Sort`, and the `GroupByHash` does spill at runtime, the `Sort` 
could simply merge its input partitions rather than actually sorting them.
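
   A minimal sketch of that runtime decision, to make the idea concrete. It 
uses only the Rust standard library and is not DataFusion code: 
`PartitionOutput`, its `sorted` flag, and `sort_or_merge` are hypothetical 
names invented for illustration, and rows are simplified to plain `i64` keys. 
The assumption is that each upstream partition can report, once it has run, 
whether its output happened to be sorted.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for the output of one executed partition.
/// `sorted` is only known after the partition has run, e.g. because the
/// upstream operator spilled and emitted its rows in key order.
pub struct PartitionOutput {
    pub rows: Vec<i64>,
    pub sorted: bool,
}

/// K-way merge of partitions that are each already sorted, using a min-heap.
fn merge_sorted(parts: &[PartitionOutput]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    // Seed the heap with the first row of every non-empty partition.
    for (p, part) in parts.iter().enumerate() {
        if let Some(&v) = part.rows.first() {
            heap.push(Reverse((v, p, 0usize)));
        }
    }
    let total = parts.iter().map(|p| p.rows.len()).sum::<usize>();
    let mut out = Vec::with_capacity(total);
    // Repeatedly take the smallest head row and refill from its partition.
    while let Some(Reverse((v, p, i))) = heap.pop() {
        out.push(v);
        if let Some(&next) = parts[p].rows.get(i + 1) {
            heap.push(Reverse((next, p, i + 1)));
        }
    }
    out
}

/// Returns the sorted rows plus a flag saying whether the cheap merge path
/// was taken (true) or a full sort was needed (false).
pub fn sort_or_merge(parts: &[PartitionOutput]) -> (Vec<i64>, bool) {
    if parts.iter().all(|p| p.sorted) {
        (merge_sorted(parts), true)
    } else {
        // At least one input is unsorted: fall back to sorting everything.
        let mut all: Vec<i64> = parts
            .iter()
            .flat_map(|p| p.rows.iter().copied())
            .collect();
        all.sort_unstable();
        (all, false)
    }
}
```

   The point of the sketch is that the choice between merging and sorting is 
made from a property observed at execution time, which a plan-time sortedness 
declaration cannot express on its own.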
   
   🤔 
   
   


