alamb commented on issue #424:
URL: 
https://github.com/apache/arrow-datafusion/issues/424#issuecomment-847981350


   My thoughts:
   
   I think it will be simpler, as @tustvold has suggested, to do most or all 
of the sort-based optimizations (e.g. optimizing away a Sort) at the 
`LogicalPlan` level, rather than in the physical plan. That way:
   1. We can work with `Expr`s rather than `PhysicalExpr`s.
   2. The knowledge of sort order can also feed into potential cost model 
decisions (e.g. join ordering, algorithm selection).
   
   Encoding the requirements / assumptions of `LogicalPlan` nodes via 
`outputOrdering` or `requiredChildOrdering` seems like a good idea to me.
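   
   To make the idea concrete, here is a minimal sketch of how an output ordering reported by each logical node could let an optimizer rule drop a redundant Sort. Everything in it (`ToyPlan`, `SortKey`, `output_ordering`, `remove_redundant_sorts`) is a hypothetical stand-in invented for illustration, not DataFusion's actual `LogicalPlan` or optimizer API:

```rust
// Hypothetical toy types for illustration only; these are NOT DataFusion's
// real LogicalPlan / Expr definitions.

/// A sort key: a column name plus a direction.
#[derive(Debug, Clone, PartialEq)]
pub struct SortKey {
    pub column: String,
    pub ascending: bool,
}

/// A drastically simplified logical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyPlan {
    /// A scan whose source is known to be sorted by `ordering` (may be empty).
    Scan { table: String, ordering: Vec<SortKey> },
    /// A filter drops rows but keeps its input's row order.
    Filter { predicate: String, input: Box<ToyPlan> },
    /// A sort imposes `keys` on its output.
    Sort { keys: Vec<SortKey>, input: Box<ToyPlan> },
}

impl ToyPlan {
    /// The ordering this node guarantees for its output (assumed semantics
    /// of the `outputOrdering` idea from the discussion).
    pub fn output_ordering(&self) -> Vec<SortKey> {
        match self {
            ToyPlan::Scan { ordering, .. } => ordering.clone(),
            ToyPlan::Filter { input, .. } => input.output_ordering(),
            ToyPlan::Sort { keys, .. } => keys.clone(),
        }
    }
}

/// True when `provided` starts with every key in `required`, in order.
fn satisfies(provided: &[SortKey], required: &[SortKey]) -> bool {
    provided.len() >= required.len()
        && provided.iter().zip(required).all(|(p, r)| p == r)
}

/// Remove every Sort whose input already provides the requested ordering.
pub fn remove_redundant_sorts(plan: ToyPlan) -> ToyPlan {
    match plan {
        ToyPlan::Sort { keys, input } => {
            // Optimize the child first, then decide whether this Sort is needed.
            let input = remove_redundant_sorts(*input);
            if satisfies(&input.output_ordering(), &keys) {
                input
            } else {
                ToyPlan::Sort { keys, input: Box::new(input) }
            }
        }
        ToyPlan::Filter { predicate, input } => ToyPlan::Filter {
            predicate,
            input: Box::new(remove_redundant_sorts(*input)),
        },
        scan @ ToyPlan::Scan { .. } => scan,
    }
}
```

   In this sketch a Sort on `a` over a Filter over a Scan already sorted on `a` collapses to just the Filter over the Scan, while a Sort over an unsorted Scan is left alone. Because the rule only compares sort keys, it never has to look at physical expressions.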
   
   In terms of physical plans, what about adding something like 
`ExecutionPlan::requires_output_sort()` that would tell the various 
physical optimizer passes when they have to preserve the output sort (and thus 
might prevent passes like "repartition exec" from rewriting the plan)?
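   
   A rough sketch of what that could look like, assuming a default-`false` trait method that a repartitioning pass consults before rewriting. `requires_output_sort` is only the name floated above, and `ToyExecutionPlan`, `ToyScan`, `ToySortedMerge`, `ToyRepartition` and `maybe_repartition` are hypothetical names made up for this example, not existing DataFusion items:

```rust
// Hypothetical toy trait for illustration only; NOT DataFusion's real
// ExecutionPlan trait.
pub trait ToyExecutionPlan {
    /// Human-readable description of this node and its input.
    fn name(&self) -> String;

    /// True when the order of this node's output must be preserved by
    /// later rewrites. Defaults to false so most nodes need no changes.
    fn requires_output_sort(&self) -> bool {
        false
    }
}

/// A plain scan: nobody depends on its output order.
pub struct ToyScan;

impl ToyExecutionPlan for ToyScan {
    fn name(&self) -> String {
        "Scan".to_string()
    }
}

/// A node whose sorted output must not be disturbed.
pub struct ToySortedMerge;

impl ToyExecutionPlan for ToySortedMerge {
    fn name(&self) -> String {
        "SortedMerge".to_string()
    }

    fn requires_output_sort(&self) -> bool {
        true
    }
}

/// A repartition node, which in this sketch is assumed to scramble row order.
pub struct ToyRepartition {
    input: Box<dyn ToyExecutionPlan>,
    partitions: usize,
}

impl ToyExecutionPlan for ToyRepartition {
    fn name(&self) -> String {
        format!("Repartition({}) <- {}", self.partitions, self.input.name())
    }
}

/// Wrap `plan` in a repartition unless its output sort must be preserved.
pub fn maybe_repartition(
    plan: Box<dyn ToyExecutionPlan>,
    partitions: usize,
) -> Box<dyn ToyExecutionPlan> {
    if plan.requires_output_sort() {
        plan
    } else {
        Box::new(ToyRepartition { input: plan, partitions })
    }
}
```

   With these toy types, `maybe_repartition(Box::new(ToyScan), 4)` wraps the scan, whereas `maybe_repartition(Box::new(ToySortedMerge), 4)` returns the node untouched.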


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

