alamb opened a new issue, #4006: URL: https://github.com/apache/arrow-datafusion/issues/4006
**Describe the bug**
DataFusion gets different answers when parquet pushdown is enabled.

NOTE: pushdown filtering is not enabled by default (as we are still working on it), so this issue is unlikely to affect users.

**To Reproduce**
1. Download data from [repro.zip](https://github.com/apache/arrow-datafusion/files/9890904/repro.zip)
2. Run datafusion CLI

The query run is

```sql
select count(*) from foo where request_duration_ns > 791684060 OR client_addr NOT in ('213.120.214.213');
```

**Expected behavior**
Same answer should be produced with and without row filtering enabled. However, with row filtering an error is produced.

Without row filtering (the default):

```shell
datafusion-cli -f script.sql
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 53819           |
+-----------------+
1 row in set. Query took 0.006 seconds.
```

With it enabled:

```shell
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true datafusion-cli -f script.sql
...
1 row in set. Query took 0.002 seconds.
ArrowError(ExternalError(Execution("Arrow error: External error: Arrow: underlying Arrow error: Compute error: Error evaluating filter predicate: Internal(\"Cannot evaluate binary expression Gt with types Utf8 and Int32\")")))
```

**Additional context**
Found by the test here https://github.com/apache/arrow-datafusion/pull/3976
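
The two runs in the reproduction can be compared mechanically. The following is only a minimal sketch, not part of the original report: the function name `compare_pushdown` and the `CLI` and `SCRIPT` variables are hypothetical, and it assumes `datafusion-cli` is on `PATH` and that `script.sql` is the script from the reproduction. It uses only the `-f` flag and the `DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS` environment variable shown in the report.

```shell
#!/bin/sh
# Hypothetical helper (not from the issue): run the same SQL script with and
# without parquet filter pushdown and report whether the outputs agree.

CLI="${CLI:-datafusion-cli}"     # assumption: the binary is on PATH
SCRIPT="${SCRIPT:-script.sql}"   # assumption: the script from the reproduction

compare_pushdown() {
    # The "Query took ..." timing line differs between runs, so drop it.
    without=$("$CLI" -f "$SCRIPT" 2>&1 | grep -v 'Query took')
    with=$(DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true \
        "$CLI" -f "$SCRIPT" 2>&1 | grep -v 'Query took')

    if [ "$without" = "$with" ]; then
        echo "MATCH"
    else
        echo "MISMATCH"
        echo "--- without pushdown ---"
        echo "$without"
        echo "--- with pushdown ---"
        echo "$with"
        return 1
    fi
}
```

After sourcing the file, calling `compare_pushdown` from the directory containing the unpacked repro data would be expected to report a mismatch while the bug is present.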