alamb opened a new issue, #4046: URL: https://github.com/apache/arrow-datafusion/issues/4046
**Describe the bug**

DataFusion generates an error for some predicates when predicate pushdown is enabled.

NOTE: This is the same symptom as reported in https://github.com/apache/arrow-datafusion/issues/4006, but with a different predicate.

NOTE: Pushdown filtering is not enabled by default (as we are still working on it), so this issue is unlikely to affect users.

**To Reproduce**

1. Download data from [repro.zip](https://github.com/apache/arrow-datafusion/files/9902718/repro.zip)
2. Run datafusion CLI

The query run is

```sql
select count(*) from foo where request_method != 'GET' OR response_status = 400 OR service = 'backend';
```

I tested this using master at https://github.com/apache/arrow-datafusion/commit/35f786bb6ce33cbd58db3e16a46958b58f7676f4, which includes the fix for #4006 in https://github.com/apache/arrow-datafusion/commit/5cf090a13391501c0ce7707ac7a1e50e18517b79

```shell
$ git status
Your branch is up to date with 'apache/master'.
nothing to commit, working tree clean
$ git rev-parse HEAD
5cf090a13391501c0ce7707ac7a1e50e18517b79
```

**Expected behavior**

The same answer should be produced with and without row filtering enabled. However, with row filtering enabled an error is produced.

Without it enabled:

```shell
datafusion-cli -f script.sql
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 53819           |
+-----------------+
1 row in set. Query took 0.006 seconds.
```

With it enabled:

```shell
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true datafusion-cli -f script.sql
...
1 row in set. Query took 0.021 seconds.
ArrowError(ExternalError(Execution("Arrow error: External error: Arrow: underlying Arrow error: Compute error: Error evaluating filter predicate: Internal(\"Cannot evaluate binary expression NotEq with types UInt16 and Utf8\")")))
```

**Additional context**

Found by the test here https://github.com/apache/arrow-datafusion/pull/3976
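For anyone re-checking this, the with/without comparison described under "Expected behavior" could be scripted roughly as below. This is only a sketch: `extract_count` and `compare_pushdown` are hypothetical helper names, and it assumes `datafusion-cli` is on `PATH`, that `script.sql` from the repro archive is in the current directory, and that the result is a one-column table as in the output shown in this report.

```shell
#!/usr/bin/env bash
# Sketch only: extract_count and compare_pushdown are hypothetical helpers,
# not part of DataFusion.

# Print the numeric cell of a one-column result table read from stdin,
# i.e. a line shaped like "| 53819           |".
extract_count() {
  awk -F'|' '/^\|[ \t]*[0-9]+[ \t]*\|[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Run the same script with and without filter pushdown and compare the counts.
# Assumes datafusion-cli is on PATH; the script path defaults to script.sql.
compare_pushdown() {
  local script="${1:-script.sql}"
  local plain pushed
  plain=$(datafusion-cli -f "$script" 2>&1 | extract_count)
  pushed=$(DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true \
    datafusion-cli -f "$script" 2>&1 | extract_count)
  if [ -n "$plain" ] && [ "$plain" = "$pushed" ]; then
    echo "OK: both runs returned $plain"
  else
    echo "MISMATCH: without pushdown='$plain', with pushdown='$pushed'"
    return 1
  fi
}
```

After sourcing the file, `compare_pushdown script.sql` reports `MISMATCH` when the two runs do not yield the same non-empty count; a run whose output contains no numeric result row yields an empty value and is treated as a mismatch.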
