irenjj commented on PR #15993:
URL: https://github.com/apache/datafusion/pull/15993#issuecomment-2865547053

   > Hi @irenjj. Thank you working on this, but did you try to set that config 
to false and run the tests?
   > 
   > I think people should see here how the plans are changed if we do so 
(perhaps you can open another PR for that -- I don't know if there is an easier 
way)
   > 
   > As I said in the discussion, I afraid our physical rules are not ready for 
this. we might either miss some sort eliminations, or possibly eliminate some 
subquery sorts even if config says opposite. Therefore; I was suggesting 
enabling this option from lower level to higher. If people still consent this, 
I couldn't say more but we should add some warnings at least.
   
   Yep, I've tried to set config to false and it can forbid eliminating sort in 
subquery. I'll add a test.
   
   >If I approach this case practically, when "order by" clauses are given in 
subqueries: these are converted into SortExecs at somewhere in the plan. 
However, in enforce_sorting, we don't track ordering requirements through 
SortExecs directly (otherwise, we wouldn't be able to eliminate truly necessary 
SortExecs). Instead, we track the requirement by inserting 
OutputRequirementExec at the top of the plan, which corresponds to the global 
ordering - that is the ordering expected when an explicit ORDER BY is given in 
the outermost query.
   > 
   > If we decide to introduce a new config for this setting, we first need to 
improve the optimizer phase. Specifically, OutputRequirementExecs could be 
inserted into intermediate nodes in the plan, meaning the subplan beneath must 
guarantee to bring the required ordering at that point.
   So, TLDR, unless we adapt the current enforce_sorting rule accordingly, we 
risk losing the ability to eliminate unnecessary SortExecs in some cases.
   
   This is a very good approach. We can implement it later to replace the 
current approach of adding configuration


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to