[GitHub] [arrow-datafusion] xudong963 edited a comment on pull request #2026: Merge adjacent filter rule for optimizer

GitBox Sat, 19 Mar 2022 03:34:19 -0700


xudong963 edited a comment on pull request #2026:
URL: 
https://github.com/apache/arrow-datafusion/pull/2026#issuecomment-1072983873



   The following is the test code I used to check the SQL path.
   
   ```rust
   use datafusion::error::Result;
   use datafusion::prelude::*;

   #[tokio::test]
   async fn main() -> Result<()> {
       // create local execution context
       let mut ctx = SessionContext::new();
   
       // register csv file with the execution context
       ctx.register_csv("test", "tests/aggregate_simple.csv", CsvReadOptions::new())
           .await?;
   
       // execute the query
       let plan = ctx.create_logical_plan(
           "select c1, c2 from test where c3 = true and c2 = 0.000001",
       )?;
   
       dbg!(plan);
   
       Ok(())
   }
   ```
   Then I got this plan:
   ```shell
   Projection: #test.c1, #test.c2
     Filter: #test.c3 = Boolean(true) AND #test.c2 = Float64(0.000001)
       TableScan: test projection=None
   ```
   
   I also tested with datafusion-cli:
   ```sql
   ❯ create table t as SELECT * FROM (VALUES (1,true), (2,false)) as t;
   0 rows in set. Query took 0.003 seconds.
   ❯ select * from t;
   +---------+---------+
   | column1 | column2 |
   +---------+---------+
   | 1       | true    |
   | 2       | false   |
   +---------+---------+
   2 rows in set. Query took 0.002 seconds.
   ❯ explain select * from t where column1 = 2 and column2 = true;
   +---------------+-------------------------------------------------------------------+
   | plan_type     | plan                                                              |
   +---------------+-------------------------------------------------------------------+
   | logical_plan  | Projection: #t.column1, #t.column2                                |
   |               |   Filter: #t.column1 = Int64(2) AND #t.column2                    |
   |               |     TableScan: t projection=Some([0, 1])                          |
   | physical_plan | ProjectionExec: expr=[column1@0 as column1, column2@1 as column2] |
   |               |   CoalesceBatchesExec: target_batch_size=4096                     |
   |               |     FilterExec: column1@0 = 2 AND column2@1                       |
   |               |       RepartitionExec: partitioning=RoundRobinBatch(12)           |
   |               |         MemoryExec: partitions=1, partition_sizes=[1]             |
   |               |                                                                   |
   +---------------+-------------------------------------------------------------------+
   2 rows in set. Query took 0.004 seconds.
   ```
   
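   In both plans above, the SQL path puts the whole `WHERE` clause into a single `Filter` node. Two cases will result in adjacent filters in the logical plan:
   - Using the DataFrame API: `df.xxx.filter(filter1).filter(filter2)`;
   - Building the logical plan directly with `LogicalPlanBuilder`: `LogicalPlanBuilder::from(xx).xxx.filter().filter()...`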
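
To make the idea concrete, here is a minimal sketch of what merging adjacent filters means. It assumes toy `Expr` and `Plan` types and a hypothetical `merge_adjacent_filters` function written only for illustration; they are not DataFusion's real types, and this is not the code in the PR.

```rust
// Toy stand-ins for a logical plan. These are NOT DataFusion's real types;
// they exist only to illustrate the rewrite.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: Expr, input: Box<Plan> },
}

/// Collapse a chain of directly nested filters into one filter whose
/// predicate is the conjunction of the originals (inner predicate first).
fn merge_adjacent_filters(plan: Plan) -> Plan {
    match plan {
        // Rewrite the child first, then check whether it is itself a filter.
        Plan::Filter { predicate, input } => match merge_adjacent_filters(*input) {
            Plan::Filter { predicate: inner, input: child } => Plan::Filter {
                predicate: Expr::And(Box::new(inner), Box::new(predicate)),
                input: child,
            },
            // The child is not a filter, so keep this filter as it is.
            other => Plan::Filter { predicate, input: Box::new(other) },
        },
        // Anything that is not a filter is left untouched.
        other => other,
    }
}
```

With these toy types, `Filter(b, Filter(a, Scan(t)))` becomes `Filter(a AND b, Scan(t))`, which has the same shape as the single `Filter` node the SQL path already produces.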
   
   By the way, I also checked the PostgreSQL, CockroachDB, and Materialize codebases, and none of them has this rule.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
