sgrebnov commented on PR #13267: URL: https://github.com/apache/datafusion/pull/13267#issuecomment-2465963391
@findepi - sorry, my bad, I should have provided more context.

We use DataFusion with [DataFusion Federation](https://github.com/spiceai/datafusion-federation/) to convert user queries into a LogicalPlan and then detect which parts of the plan belong to external execution engines. Those parts are converted to SQL (unparsed) and executed by the remote engines as part of the overall query execution. For example, in the scenario below, parts of the LogicalPlan are executed by external engines (MySQL and PostgreSQL) by unparsing the corresponding sub-plans, while the final aggregations or joins are processed by DataFusion. All of this happens as part of DataFusion's execution logic.

When multiple external engines are involved, only parts of the main plan are converted (see the example below), so without optimized/pushed-down projections we end up fetching all columns. With projection optimization, the required columns are propagated to the child nodes, so only those columns are fetched. Thus, the goal is to have the projection column pruning optimization enabled and still be able to unparse the logical plan back to SQL afterward.

Please let me know if I should elaborate more on the challenges with the unparser after the optimization rules are applied.

```
                      ┌────────────────────────┐
                      │   Join / Aggregation   │   B and C are
                      └────────────────────────┘   available in an
                                  ▲                external database
                                  │                DBMS-2 (PostgreSQL)
                                  │
  A is available in an            │                Unparse -> SQL
  external database in            ├─────────────────────┐
  DBMS-1 (MySQL).                 │┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─│─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┐
                     ┌────────────┘                     │
  Unparse -> SQL     │             │                    │                      │
                     │                          ┌───────┴──────┐
                     │             │            │     Join     │               │
       ┌ ─ ─ ─ ─ ─ ─ ┼ ─ ─ ─ ─ ─ ┐              └───────▲──────┘
                     │             │                    │                      │
       │             │           │            ┌─────────┴──────────┐
            ┌────────┴───────┐     │          │                    │           │
       │    │     Scan A     │   │            │                    │
            └────────────────┘     │  ┌────────────────┐  ┌────────────────┐   │
       │                         │    │     Scan B     │  │     Scan C     │
                                   │  └────────────────┘  └────────────────┘   │
       │                         │
       └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘ └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
```
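
To make the pruning-then-unparse flow described above concrete, here is a minimal, self-contained sketch. It does not use DataFusion's API: the `Plan` enum and the `prune`, `unparse`, `cols`, and `example_plan` functions are hypothetical stand-ins for a logical plan, the projection optimization, and the SQL unparser, and the tables `b` and `c` with their columns are made up for illustration.

```rust
use std::collections::BTreeSet;

/// Toy logical plan. A simplified, hypothetical stand-in, not DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Scan of a remote table; `columns` lists the columns that will be fetched.
    Scan { table: String, columns: Vec<String> },
    /// Inner equi-join on a column that has the same name on both sides.
    Join { left: Box<Plan>, right: Box<Plan>, on: String },
    /// Keeps only the named columns of its input.
    Projection { input: Box<Plan>, columns: Vec<String> },
}

/// Helper to build an owned column list from string literals.
fn cols(names: &[&str]) -> Vec<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// Propagates the set of required columns down to the scans.
fn prune(plan: &Plan, required: &BTreeSet<String>) -> Plan {
    match plan {
        // A scan keeps only the columns that some ancestor needs.
        Plan::Scan { table, columns } => Plan::Scan {
            table: table.clone(),
            columns: columns.iter().filter(|c| required.contains(*c)).cloned().collect(),
        },
        // A join needs its key in addition to whatever its parent needs.
        Plan::Join { left, right, on } => {
            let mut needed = required.clone();
            needed.insert(on.clone());
            Plan::Join {
                left: Box::new(prune(left, &needed)),
                right: Box::new(prune(right, &needed)),
                on: on.clone(),
            }
        }
        // Simplification: a projection requires exactly its own columns from its
        // input and ignores what its parent asked for.
        Plan::Projection { input, columns } => {
            let needed: BTreeSet<String> = columns.iter().cloned().collect();
            Plan::Projection {
                input: Box::new(prune(input, &needed)),
                columns: columns.clone(),
            }
        }
    }
}

/// Renders a plan as SQL text. Scans with no columns left are not handled here.
fn unparse(plan: &Plan) -> String {
    match plan {
        Plan::Scan { table, columns } => {
            format!("SELECT {} FROM {}", columns.join(", "), table)
        }
        Plan::Join { left, right, on } => format!(
            "SELECT * FROM ({}) AS l JOIN ({}) AS r USING ({})",
            unparse(left),
            unparse(right),
            on
        ),
        Plan::Projection { input, columns } => {
            format!("SELECT {} FROM ({}) AS p", columns.join(", "), unparse(input))
        }
    }
}

/// Projection over a join of two remote tables, mirroring the B/C branch above.
fn example_plan() -> Plan {
    Plan::Projection {
        input: Box::new(Plan::Join {
            left: Box::new(Plan::Scan { table: "b".to_string(), columns: cols(&["id", "x", "y"]) }),
            right: Box::new(Plan::Scan { table: "c".to_string(), columns: cols(&["id", "z", "w"]) }),
            on: "id".to_string(),
        }),
        columns: cols(&["x", "z"]),
    }
}

fn main() {
    let plan = example_plan();
    let pruned = prune(&plan, &BTreeSet::new());
    // Only the join sub-plan is sent to the remote engine; the projection stays local.
    for (label, p) in [("before", &plan), ("after", &pruned)] {
        if let Plan::Projection { input, .. } = p {
            println!("{}: {}", label, unparse(input));
        }
    }
}
```

In this toy setup the SQL sent to the remote engine selects `id, x, y` and `id, z, w` before pruning, and only `id, x` and `id, z` afterward. Unparsing has to run after the optimization for that saving to reach the remote engine, which is why the two need to work together.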