stuartcarnie opened a new issue, #6072:
URL: https://github.com/apache/arrow-datafusion/issues/6072

   ### Describe the bug
   
   Predicates that combine multiple `true` literals with a disjunction followed by a conjunction (for example, `true OR true AND cpu = 'cpu0'`) are simplified incorrectly, producing incorrect results.
   
   For example, executing the following query:
   
   ```sql
   WITH t(time, cpu) AS (VALUES (0, 'cpu0'), (1, 'cpu1'))
   SELECT * FROM t
   WHERE (time = 0 OR time = 1) AND (true OR true AND cpu = 'cpu0');
   ```
   
   This produces the following incorrect result. The row with `cpu1` in the `cpu` column should not be included:
   
   ```
   +------+------+
   | time | cpu  |
   +------+------+
   | 0    | cpu0 |
   | 1    | cpu1 |
   +------+------+
   2 rows in set. Query took 0.004 seconds.
   ```
   
   ### To Reproduce
   
   ```sql
   WITH t(time, cpu) AS (VALUES (0, 'cpu0'), (1, 'cpu1'))
   SELECT * FROM t
   WHERE (time = 0 OR time = 1) AND (true OR true AND cpu = 'cpu0')
   ```
   
   ### Expected behavior
   
   The query should return only the `cpu0` row:
   
   ```
   +------+------+
   | time | cpu  |
   +------+------+
   | 0    | cpu0 |
   +------+------+
   1 row in set. Query took 0.004 seconds.
   ```
   
   ### Additional context
   
   Running `EXPLAIN VERBOSE` indicates that the `simplify_expressions` rule rewrites the filter node incorrectly, dropping the second conjunct entirely:
   
   ```text
   ❯ explain verbose WITH t(time, cpu) AS (VALUES (0, 'cpu0'), (1, 'cpu1')) SELECT * from t WHERE (time = 0 OR time = 1) AND (true OR true AND cpu = 'cpu0');
   +------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
   | plan_type                                                  | plan                                                                                                       |
   +------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
   | initial_logical_plan                                       | Projection: time, cpu                                                                                      |
   |                                                            |   Filter: (time = Int64(0) OR time = Int64(1)) AND (Boolean(true) OR Boolean(true) AND cpu = Utf8("cpu0")) |
   |                                                            |     Projection: t.column1 AS time, t.column2 AS cpu                                                        |
   |                                                            |       SubqueryAlias: t                                                                                     |
   |                                                            |         Values: (Int64(0), Utf8("cpu0")), (Int64(1), Utf8("cpu1"))                                         |
   | logical_plan after inline_table_scan                       | SAME TEXT AS ABOVE                                                                                         |
   | logical_plan after type_coercion                           | SAME TEXT AS ABOVE                                                                                         |
   | logical_plan after simplify_expressions                    | Projection: time, cpu                                                                                      |
   |                                                            |   Filter: time = Int64(0) OR time = Int64(1)                                                               |
   |                                                            |     Projection: t.column1 AS time, t.column2 AS cpu                                                        |
   |                                                            |       SubqueryAlias: t                                                                                     |
   |                                                            |         Values: (Int64(0), Utf8("cpu0")), (Int64(1), Utf8("cpu1"))                                         |
   ```
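
As a triage aid, here is a minimal Python sketch (not DataFusion code) that evaluates the reported predicate over the two rows under two possible groupings of the second conjunct. In standard SQL, `AND` binds more tightly than `OR`, so `true OR true AND cpu = 'cpu0'` groups as `true OR (true AND cpu = 'cpu0')`. The second function, `or_grouped_first`, is a hypothetical alternative grouping included only for comparison. Neither column contains NULLs here, so plain two-valued logic is assumed to be enough.

```python
# Rows from the VALUES clause in the report.
rows = [(0, "cpu0"), (1, "cpu1")]


def standard_precedence(time, cpu):
    # AND binds tighter than OR: true OR (true AND cpu = 'cpu0')
    return (time == 0 or time == 1) and (True or (True and cpu == "cpu0"))


def or_grouped_first(time, cpu):
    # Hypothetical grouping for comparison: (true OR true) AND cpu = 'cpu0'
    return (time == 0 or time == 1) and ((True or True) and cpu == "cpu0")


standard_rows = [r for r in rows if standard_precedence(*r)]
grouped_rows = [r for r in rows if or_grouped_first(*r)]

print(standard_rows)  # [(0, 'cpu0'), (1, 'cpu1')]
print(grouped_rows)   # [(0, 'cpu0')]
```

The single-row result listed under "Expected behavior" matches the second grouping, while the two-row result matches the first. When assessing the `simplify_expressions` rewrite, it is worth confirming which grouping the query is meant to have. Writing the parentheses explicitly in the reproduction removes the ambiguity.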

