berkaysynnada commented on issue #9111:
URL: 
https://github.com/apache/arrow-datafusion/issues/9111#issuecomment-1926347736

   > I have two questions:
   > 
   > 1. Do you know of any examples of "algorithmic limitations" (e.g. plans 
   > where unnecessary columns are carried through)?
   > 2. How does this compare to how pushdown is done during LogicalPlanning 
   > (e.g. 
   > https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/push_down_projection.rs)?
   > Are you planning changes / extensions to that logic too? One of the 
   > reasons to do pushdown at that level is that it is less complicated (e.g. 
   > output indexes aren't used).
    
   1. By "algorithmic limitations" I meant limitations of the current rule. The 
main problem occurs in a situation like this:
   `ProjectionA <- OperatorA <- OperatorB`
   The pushdown mechanism tries to push the projection between `OperatorA` and 
`OperatorB`. When it cannot push it down, the rule gives up. However, in some 
cases the best plan looks like this:
   `ProjectionB <- OperatorA <- ProjectionC <- OperatorB`
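
   To make the split concrete, here is a minimal toy sketch of it. It is not 
DataFusion code: `ToyOperator`, `SplitProjection` and `split_projection` are 
hypothetical names, and the sketch assumes `OperatorA` passes its input columns 
through unchanged (as a filter would) and is described only by the input column 
indices its own expressions read.

```rust
use std::collections::BTreeSet;

/// Hypothetical toy model of `OperatorA`: only the input column indices
/// that its own expressions (e.g. a filter predicate) read.
struct ToyOperator {
    used_columns: BTreeSet<usize>,
}

/// Result of splitting one projection around an operator.
#[derive(Debug, PartialEq)]
struct SplitProjection {
    /// Columns kept by the projection inserted below the operator
    /// (`ProjectionC`), as indices into the operator's original input.
    below: Vec<usize>,
    /// Columns kept by the projection left above the operator
    /// (`ProjectionB`), as indices into the output of `below`.
    above: Vec<usize>,
}

/// Split `projected` (the columns requested by `ProjectionA`) around `op`.
/// Assumption: `op` forwards its input columns unchanged.
fn split_projection(projected: &[usize], op: &ToyOperator) -> SplitProjection {
    // The lower projection must keep everything the operator itself reads
    // plus everything the upper projection finally outputs.
    let mut needed: BTreeSet<usize> = op.used_columns.clone();
    needed.extend(projected.iter().copied());
    let below: Vec<usize> = needed.into_iter().collect();

    // The upper projection is re-expressed against the narrowed schema.
    let above: Vec<usize> = projected
        .iter()
        .map(|c| {
            below
                .iter()
                .position(|b| b == c)
                .expect("every projected column is kept below")
        })
        .collect();

    SplitProjection { below, above }
}
```

   For example, if `ProjectionA` asks for columns `[3, 1]` and `OperatorA` reads 
columns `{1, 2}`, the sketch keeps `[1, 2, 3]` below the operator and selects 
`[2, 0]` above it, so the unused column `0` is no longer carried through 
`OperatorA`.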
   
   Another problem occurs when a projection is inserted into the input of some 
operator to reduce the load: the inserted projection may change the indices of 
some columns in the operators that consume its output. In such a case, all 
operators subject to a column index change must be rewritten.
   I would like to eliminate these kinds of restrictions by extending the 
capabilities of the current rule, and I believe a pure pushdown mechanism falls 
short of that.
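
   The index-rewriting step can be sketched as follows. This is only an 
illustration under simplifying assumptions, not the rule's implementation: 
columns are modeled as plain `usize` indices, and `index_mapping` and 
`rewrite_column_refs` are hypothetical helper names.

```rust
use std::collections::HashMap;

/// Build an old-index -> new-index mapping for an inserted projection that
/// keeps the columns in `kept`, in that order.
fn index_mapping(kept: &[usize]) -> HashMap<usize, usize> {
    kept.iter()
        .enumerate()
        .map(|(new, &old)| (old, new))
        .collect()
}

/// Rewrite the column indices referenced by a downstream operator.
/// Returns `None` if the operator references a column that the inserted
/// projection dropped, in which case the insertion is not valid.
fn rewrite_column_refs(
    refs: &[usize],
    mapping: &HashMap<usize, usize>,
) -> Option<Vec<usize>> {
    refs.iter().map(|old| mapping.get(old).copied()).collect()
}
```

   With `kept = [1, 3, 4]`, an operator that referenced columns `[4, 1]` would be 
rewritten to reference `[2, 0]`, while an operator that referenced column `0` 
would yield `None`, signalling that the projection cannot be inserted there as 
is.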
   
   2. I haven't examined the logical plan rule in depth, but I know it is less 
complicated. However, after all the physical plan rules have run, it would be 
reassuring to have such a rule double-check the plan.
   
   > Maybe you could add an API like this:
   
   The API requirements of the new version are a little different. I will ask 
for your opinions again when the PR is ready.
   Thanks for your evaluation and suggestions.

