berkaysynnada commented on issue #9111: URL: https://github.com/apache/arrow-datafusion/issues/9111#issuecomment-1926347736
> I have two questions:
>
> 1. Do you know of any examples of "algorithmic limitations" (e.g. plans where unnecessary columns are carried through)?
> 2. How does this compare to how pushdown is done during LogicalPlanning (e.g. https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/push_down_projection.rs). Are you planning changes / extension to that logic too? One of the reasons to do pushdown at that level is that it is less complicated (e.g. output indexes aren't used)

1. By "algorithmic limitations" I meant limitations of the current rule. The main problem occurs in a situation like this:

   `ProjectionA <- OperatorA <- OperatorB`

   The pushdown mechanism tries to push the projection between `OperatorA` and `OperatorB`. When it cannot push it down, the rule gives up. However, there may be cases where the best plan looks like this:

   `ProjectionB <- OperatorA <- ProjectionC <- OperatorB`

   Another problem occurs when a projection is inserted at the input of some operator to reduce the load: that insertion may change the indices of some columns in the operators above it. In that case, every operator affected by a column index change must be rewritten. I would like to eliminate these kinds of restrictions by extending the capabilities of the current rule, and I believe a pure pushdown mechanism falls short of that.

2. I haven't examined the logical plan rule in depth, but I know it is less complicated. However, after all the other rules have run on the physical plan, it would be reassuring to have such a rule double-check the plan.

> Maybe you could add an API like this:

The API requirements of the new version are a little different. I will ask for your opinion again when the PR is ready. Thanks for your evaluation and suggestions.
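To make the column index problem from point 1 concrete, here is a minimal sketch in plain Rust. It does not use DataFusion's types: `ColumnRef`, `index_mapping`, and `rewrite_columns` are hypothetical names introduced only for illustration. It assumes an inserted projection keeps a subset of its input columns, so every column reference in the operators above it has to be remapped from the old index to the new one.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a column reference held by a physical operator
/// (not DataFusion's actual `Column` type).
#[derive(Debug, Clone, PartialEq)]
struct ColumnRef {
    name: String,
    index: usize,
}

/// Given the input column indices an inserted projection keeps (in output
/// order), build a map from each old index to its new position.
fn index_mapping(kept: &[usize]) -> HashMap<usize, usize> {
    kept.iter()
        .enumerate()
        .map(|(new_idx, &old_idx)| (old_idx, new_idx))
        .collect()
}

/// Rewrite the column references of an operator that sits above the inserted
/// projection. Returns `None` if the operator needs a column that the
/// projection dropped, which means the projection cannot be inserted as is.
fn rewrite_columns(
    cols: &[ColumnRef],
    mapping: &HashMap<usize, usize>,
) -> Option<Vec<ColumnRef>> {
    cols.iter()
        .map(|c| {
            mapping.get(&c.index).map(|&new_idx| ColumnRef {
                name: c.name.clone(),
                index: new_idx,
            })
        })
        .collect()
}
```

For example, if the projection keeps input columns 0, 2 and 3, a reference to column 2 becomes a reference to column 1, while an operator that still needs column 1 makes `rewrite_columns` return `None`. A real rule would have to apply this kind of rewrite to every affected operator up the plan.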
