timsaucer commented on issue #1551:
URL: 
https://github.com/apache/datafusion-python/issues/1551#issuecomment-4487543692

   Ok, it looks like this is happening in the provider, not in the 
datafusion-python library as I had originally expected. What is happening is 
that we have a `Column` physical expression that fails to simplify because 
`simplify_const_expr_immediate` cannot downcast it to `Column`, since it is 
foreign at that point.
   
   This was an unintended side effect of 
https://github.com/apache/datafusion/pull/18916
   
   Here is what is happening:
   
   - The table provider receives a pushdown filter expression (logical) and 
a `&dyn Session`, which is the `datafusion-python` session context.
   - The `ParquetSource` it uses under the hood takes a `PhysicalExpr` as 
its predicate. That `PhysicalExpr` is created by the `datafusion-python` 
session context during the call to `create_physical_expr`. It originates in 
the `datafusion-python` library, so it is foreign from the user library's 
point of view. That is, the user library sees it as a `ForeignPhysicalExpr`.
   - In the user library we then get calls to `simplify`. It looks like this 
happens both in open and in the row group filter (which is where we're hitting 
it here).
   - During `simplify_const_expr_immediate`, `simplify` checks whether the 
expression is a `Column`. The downcast to `Column` fails because this 
`simplify` call runs in the user library, NOT in the main datafusion-python 
library.
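   
   To illustrate why the downcast fails, here is a minimal std-only sketch. 
It does not use the real DataFusion types: `Expr`, `ForeignExpr` and 
`column_name` are hypothetical stand-ins, and the real wrapper may be 
structured differently. It only shows that a downcast through `Any` looks at 
the concrete type of the wrapper, not at what it wraps.
   
   ```rust
   use std::any::Any;
   use std::sync::Arc;

   // Hypothetical stand-in for the `PhysicalExpr` trait; only the `as_any`
   // hook that downcasting relies on is modeled here.
   trait Expr {
       fn as_any(&self) -> &dyn Any;
   }

   // Stand-in for a `Column` expression built inside the current library.
   struct Column {
       name: String,
   }

   impl Expr for Column {
       fn as_any(&self) -> &dyn Any {
           self
       }
   }

   // Stand-in for an expression that crossed a library boundary. It wraps
   // the original expression, so its concrete type is the wrapper itself.
   struct ForeignExpr {
       #[allow(dead_code)]
       inner: Arc<dyn Expr>,
   }

   impl Expr for ForeignExpr {
       fn as_any(&self) -> &dyn Any {
           self
       }
   }

   // The kind of check a simplifier performs: "is this expression a Column?"
   fn column_name(expr: &dyn Expr) -> Option<&str> {
       expr.as_any()
           .downcast_ref::<Column>()
           .map(|c| c.name.as_str())
   }
   ```
   
   With these stand-ins, `column_name` returns the name for a plain `Column` 
but `None` for a `ForeignExpr` wrapping that same column, which is the shape 
of the failure described above.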
   
   Here is a workaround I have tested. It creates the physical expression 
directly in the user library instead of going through the session:
   
   ```
           let execution_props = ExecutionProps::new();
           let predicate = predicate
               .map(|predicate| {
                   datafusion::physical_expr::create_physical_expr(
                       &predicate,
                       &df_schema,
                       &execution_props,
                   )
               })
               .transpose()?
               // if there are no filters, use a literal true to have a predicate
               // that always evaluates to true we can pass to the index
               .unwrap_or_else(|| datafusion::physical_expr::expressions::lit(true));
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

