alamb commented on pull request #7880:
URL: https://github.com/apache/arrow/pull/7880#issuecomment-674429867


   I have had this nagging sensation that the algorithm isn't quite right when 
the same column is used multiple times. I finally came up with an example that 
shows part of what I have been worrying about. 
   
   Here is a new test that passes on this branch but I think is incorrect. 
Specifically, with two `Selection`s for the same variable separated by a 
`Limit`, one of the `Selection`s is  lost with the algorithm as written.  Am I 
missing something?
   
   ```
   
       #[test]
       fn filter_2_breaks_limits() -> Result<()> {
           let table_scan = test_table_scan()?;
           let plan = LogicalPlanBuilder::from(&table_scan)
               .project(vec![col("a")])?
               .filter(col("a").lt_eq(&Expr::Literal(ScalarValue::Int64(1))))?
               .limit(1)?
               .project(vec![col("a")])?
               .filter(col("a").gt_eq(&Expr::Literal(ScalarValue::Int64(1))))?
               .build()?;
           // Should be able to move both filters below the projections
   
           // not part of the test
           assert_eq!(
               format!("{:?}", plan),
               "Selection: #a GtEq Int64(1)\
                \n  Projection: #a\
                \n    Limit: 1\
                \n      Selection: #a LtEq Int64(1)\
                \n        Projection: #a\
                \n          TableScan: test projection=None"
           );
   
           // This just seems wong: we lost a selection....
           let expected = "\
           Projection: #a\
           \n  Selection: #a GtEq Int64(1)\
           \n    Limit: 1\
           \n      Projection: #a\
           \n        TableScan: test projection=None";
   
           assert_optimized_plan_eq(&plan, expected);
           Ok(())
       }
   ```
   
   FYI @jorgecarleitao 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to