goldmedal commented on PR #13267: URL: https://github.com/apache/datafusion/pull/13267#issuecomment-2462745264
Thanks, @sgrebnov. I'm still reviewing. I wonder if it could be the default behavior for the `optimize_projections` rule instead of an option 🤔. I tested some cases and found that the generated physical plans are the same whether this config is enabled. ```rust let mut config = ConfigOptions::new(); config .optimizer .optimize_projections_preserve_existing_projections = true; let state = SessionConfig::from(config); let ctx = SessionContext::new_with_config(state); ctx.sql("CREATE TABLE test (a int, b int)").await.unwrap(); let sql = "SELECT a, b FROM (SELECT a, b, count(*) from (SELECT a, b from test) group by a, b)"; ctx.sql(sql).await.unwrap().explain(false, false).unwrap().show().await.unwrap(); let mut config = ConfigOptions::new(); config .optimizer .optimize_projections_preserve_existing_projections = false; let state = SessionConfig::from(config); let ctx = SessionContext::new_with_config(state); ctx.sql("CREATE TABLE test (a int, b int)").await.unwrap(); let sql = "SELECT a, b FROM (SELECT a, b, count(*) from (SELECT a, b from test) group by a, b)"; ctx.sql(sql).await.unwrap().explain(false, false).unwrap().show().await.unwrap(); ``` The result is ``` Enable: optimize_projections_preserve_existing_projections +---------------+--------------------------------------------------------------------------+ | plan_type | plan | +---------------+--------------------------------------------------------------------------+ | logical_plan | Projection: test.a, test.b | | | Aggregate: groupBy=[[test.a, test.b]], aggr=[[]] | | | Projection: test.a, test.b | | | TableScan: test projection=[a, b] | | physical_plan | AggregateExec: mode=SinglePartitioned, gby=[a@0 as a, b@1 as b], aggr=[] | | | MemoryExec: partitions=1, partition_sizes=[0] | | | | +---------------+--------------------------------------------------------------------------+ Disable optimize_projections_preserve_existing_projections +---------------+--------------------------------------------------------------------------+ | plan_type | plan | +---------------+--------------------------------------------------------------------------+ | logical_plan | Aggregate: groupBy=[[test.a, test.b]], aggr=[[]] | | | TableScan: test projection=[a, b] | | physical_plan | AggregateExec: mode=SinglePartitioned, gby=[a@0 as a, b@1 as b], aggr=[] | | | MemoryExec: partitions=1, partition_sizes=[0] | | | | +---------------+--------------------------------------------------------------------------+ ``` I'm not so familiar with the physical planner but I think if it doesn't cause too much effort for the physical planning, we can make it preserve the projection by default. @alamb could you take a look? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org