kumarUjjawal commented on issue #19866:
URL: https://github.com/apache/datafusion/issues/19866#issuecomment-3765438348

   Hi @geoHeil, thank you for the detailed report. I looked into it, and I think 
this is working as designed. The `arrow_cast` function is a planning-time-only 
construct: it is meant to be rewritten to a standard `CAST` during DataFusion's 
optimizer pass and is never executed directly.
   
   When you run a query through DataFusion normally, the simplifier converts:
   
   ```
   arrow_cast('2024-01-01', 'Timestamp(Microsecond, Some("UTC"))')
   ```
   into:
   
   ```
   CAST('2024-01-01' AS Timestamp(Microsecond, Some("UTC")))
   ```
   
   The issue is that deltalake is taking your predicate string and executing it 
directly without running it through DataFusion's optimizer/simplifier. That's 
why it hits the "should have been simplified" error.
   
   The fix should be in deltalake: they would need to run predicates through 
`ExprSimplifier` before execution, or handle `ARROW_CAST` syntax on their end.
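   
   To illustrate the idea, here is a minimal, self-contained sketch of 
"simplify before execute". It is a toy model only: the `Expr` enum and the 
`simplify` and `execute` functions below are hypothetical stand-ins, not 
DataFusion's or deltalake's actual types or APIs, and the error text simply 
mirrors the message quoted above.
   
   ```rust
   // Toy model: hypothetical types, NOT DataFusion's real API.
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Literal(String),
       // Planning-time-only call, e.g. arrow_cast(value, 'type string').
       ArrowCast { value: Box<Expr>, type_str: String },
       // Standard cast that the executor knows how to evaluate.
       Cast { value: Box<Expr>, data_type: String },
   }
   
   // Rewrite pass: replace every ArrowCast node with an equivalent Cast node.
   fn simplify(expr: Expr) -> Expr {
       match expr {
           Expr::ArrowCast { value, type_str } => Expr::Cast {
               value: Box::new(simplify(*value)),
               data_type: type_str,
           },
           Expr::Cast { value, data_type } => Expr::Cast {
               value: Box::new(simplify(*value)),
               data_type,
           },
           lit @ Expr::Literal(_) => lit,
       }
   }
   
   // Executor: evaluates Literal and Cast, but refuses ArrowCast because it
   // expects the rewrite pass to have removed it already.
   fn execute(expr: &Expr) -> Result<String, String> {
       match expr {
           Expr::Literal(s) => Ok(s.clone()),
           Expr::Cast { value, data_type } => {
               let v = execute(value)?;
               Ok(format!("{}::{}", v, data_type))
           }
           Expr::ArrowCast { .. } => {
               Err("arrow_cast should have been simplified".to_string())
           }
       }
   }
   ```
   
   In this sketch, calling `execute` on a raw `ArrowCast` node returns the 
error, while calling `execute(&simplify(expr))` succeeds, which is the ordering 
a predicate consumer would presumably need to follow.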
   
   Your workaround of using `dialect="postgres"` is the right approach for now, 
since it generates standard SQL cast syntax that works directly.
   
   cc @alamb 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

