sarahyurick opened a new issue, #6432:
URL: https://github.com/apache/arrow-datafusion/issues/6432

   ### Describe the bug
   
   In the Dask-SQL project, we have relied on DataFusion to create `IS NOT NULL` filters at the `TableScan` level whenever a column is involved in a join. However, it looks like recent changes may have removed this feature for queries written with an explicit `JOIN ... ON` clause.
   
   ### To Reproduce
   
   The query
   ```
   SELECT d_col
          FROM c_table
          JOIN d_table ON d_col=c_col
   ```
   has the following `LogicalPlan`, with no `IS NOT NULL` filter on either `TableScan`:
   ```
   Projection: d_table.d_col
     Inner Join:  Filter: d_table.d_col = c_table.c_col
       TableScan: c_table projection=[c_col]
       TableScan: d_table projection=[d_col]
   ```
   
   ### Expected behavior
   
   We expected the same `IS NOT NULL` filters that are still generated when the join condition is written in a `WHERE` clause.
   ```
   SELECT d_col
          FROM c_table, d_table WHERE d_col=c_col
   ```
   produces
   ```
   Projection: d_table.d_col
     Inner Join: c_table.c_col= d_table.d_col
       TableScan: c_table projection=[c_col], full_filters=[c_table.c_col IS NOT NULL]
       TableScan: d_table projection=[d_col], full_filters=[d_table.d_col IS NOT NULL]
   ```
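
For context on why these filters are safe to add: under SQL semantics `NULL = NULL` is not true, so an inner equi-join can never match a row whose join key is `NULL`. The sketch below illustrates this outside of DataFusion, using Python's built-in `sqlite3` module. The sample rows are made up for illustration, and the subqueries only mimic what a scan-level filter would do. It is not the DataFusion optimizer itself.

```python
import sqlite3

# In-memory database with the two tables from the issue.
# The sample values (including the NULLs) are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE c_table (c_col INTEGER);
    CREATE TABLE d_table (d_col INTEGER);
    INSERT INTO c_table VALUES (1), (2), (NULL);
    INSERT INTO d_table VALUES (2), (3), (NULL);
    """
)

# The two query shapes from the issue.
join_on = "SELECT d_col FROM c_table JOIN d_table ON d_col = c_col"
join_where = "SELECT d_col FROM c_table, d_table WHERE d_col = c_col"

# Same join, with each input pre-filtered to non-NULL keys, roughly what
# an IS NOT NULL filter on each TableScan would achieve.
prefiltered = """
    SELECT d_col
    FROM (SELECT c_col FROM c_table WHERE c_col IS NOT NULL) AS c
    JOIN (SELECT d_col FROM d_table WHERE d_col IS NOT NULL) AS d
      ON d_col = c_col
"""

results = [
    sorted(conn.execute(query).fetchall())
    for query in (join_on, join_where, prefiltered)
]

# All three queries return the same rows: NULL keys never join.
print(results)
```

Since the pre-filtered form returns the same rows, pushing `IS NOT NULL` down to the scans only reduces the amount of data read, which is why Dask-SQL benefits from it.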
   
   ### Additional context
   
   I'm not quite sure when this change was introduced, or whether it was intentional. Is this something that DataFusion would be willing to fix, or would it be preferred that Dask-SQL re-adds the optimizer rule on our side?
   
   cc @ayushdg @jdye64 
   

