goldmedal commented on PR #13267:
URL: https://github.com/apache/datafusion/pull/13267#issuecomment-2462745264

   Thanks, @sgrebnov. I'm still reviewing. I wonder if it could be the default 
behavior for the `optimize_projections` rule instead of an option 🤔.
   
   I tested some cases and found that the generated physical plans are the same 
whether this config is enabled.
   ```rust
       let mut config = ConfigOptions::new();
       config
           .optimizer
           .optimize_projections_preserve_existing_projections = true;
       let state = SessionConfig::from(config);
       let ctx = SessionContext::new_with_config(state);
       ctx.sql("CREATE TABLE test (a int, b int)").await.unwrap();
       let sql = "SELECT a, b FROM (SELECT a, b, count(*) from (SELECT a, b 
from test) group by a, b)";
       ctx.sql(sql).await.unwrap().explain(false, 
false).unwrap().show().await.unwrap();
   
       let mut config = ConfigOptions::new();
       config
           .optimizer
           .optimize_projections_preserve_existing_projections = false;
       let state = SessionConfig::from(config);
       let ctx = SessionContext::new_with_config(state);
       ctx.sql("CREATE TABLE test (a int, b int)").await.unwrap();
       let sql = "SELECT a, b FROM (SELECT a, b, count(*) from (SELECT a, b 
from test) group by a, b)";
       ctx.sql(sql).await.unwrap().explain(false, 
false).unwrap().show().await.unwrap();
   ```
   The result is
   ```
   Enable: optimize_projections_preserve_existing_projections
   
+---------------+--------------------------------------------------------------------------+
   | plan_type     | plan                                                       
              |
   
+---------------+--------------------------------------------------------------------------+
   | logical_plan  | Projection: test.a, test.b                                 
              |
   |               |   Aggregate: groupBy=[[test.a, test.b]], aggr=[[]]         
              |
   |               |     Projection: test.a, test.b                             
              |
   |               |       TableScan: test projection=[a, b]                    
              |
   | physical_plan | AggregateExec: mode=SinglePartitioned, gby=[a@0 as a, b@1 
as b], aggr=[] |
   |               |   MemoryExec: partitions=1, partition_sizes=[0]            
              |
   |               |                                                            
              |
   
+---------------+--------------------------------------------------------------------------+
   Disable optimize_projections_preserve_existing_projections
   
+---------------+--------------------------------------------------------------------------+
   | plan_type     | plan                                                       
              |
   
+---------------+--------------------------------------------------------------------------+
   | logical_plan  | Aggregate: groupBy=[[test.a, test.b]], aggr=[[]]           
              |
   |               |   TableScan: test projection=[a, b]                        
              |
   | physical_plan | AggregateExec: mode=SinglePartitioned, gby=[a@0 as a, b@1 
as b], aggr=[] |
   |               |   MemoryExec: partitions=1, partition_sizes=[0]            
              |
   |               |                                                            
              |
   
+---------------+--------------------------------------------------------------------------+
   ```
   
   I'm not so familiar with the physical planner but I think if it doesn't 
cause too much effort for the physical planning, we can make it preserve the 
projection by default.
   
   @alamb could you take a look? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to