Miha-Cancula-Flarion opened a new pull request, #23245: URL: https://github.com/apache/datafusion/pull/23245
## Which issue does this PR close?

- Closes #.

## Rationale for this change

When https://github.com/apache/parquet-java/ writes a bloom filter for a boolean column, it does not actually add the column's values to the filter, so the filter ends up empty. The DataFusion reader then incorrectly concludes that such a file contains no matching values and skips it while reading.

## What changes are included in this PR?

This change makes it so that we always assume a boolean column may contain the requested value, essentially ignoring the bloom filter for boolean columns.

## Are these changes tested?

Not yet.

## Are there any user-facing changes?

This may affect performance in cases where the split block bloom filter (SBBF) was written correctly and thus legitimately excludes some data files. With this change, those files will still be scanned.

There are no changes to the API.
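The intended behavior can be illustrated with a minimal, self-contained sketch. This is not the code in this PR and does not use DataFusion's or the parquet crate's APIs: `PhysicalType`, `ToyFilter`, and `may_contain` are hypothetical names, and the filter is modeled as an exact set rather than a real SBBF.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a column's physical type (not a DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PhysicalType {
    Boolean,
    Int32,
}

/// Toy stand-in for a bloom filter, modeled as an exact set of values.
/// A real SBBF is probabilistic; this only illustrates the pruning decision.
struct ToyFilter {
    values: HashSet<u64>,
}

impl ToyFilter {
    /// An empty filter, like the one described for boolean columns above.
    fn empty() -> Self {
        ToyFilter { values: HashSet::new() }
    }

    fn insert(&mut self, value: u64) {
        self.values.insert(value);
    }

    fn check(&self, value: u64) -> bool {
        self.values.contains(&value)
    }
}

/// Returns true when the data may contain `value` and so must be scanned.
fn may_contain(ty: PhysicalType, filter: Option<&ToyFilter>, value: u64) -> bool {
    match (ty, filter) {
        // Assumption from the PR description: a boolean column's filter may be
        // empty even though the column has values, so never prune on it.
        (PhysicalType::Boolean, _) => true,
        // Without a filter there is nothing to prune with.
        (_, None) => true,
        // For other types, trust the filter.
        (_, Some(f)) => f.check(value),
    }
}
```

In this sketch an empty filter on a boolean column never causes a skip, while an empty filter on an `Int32` column still prunes. That mirrors the trade-off noted above: correctly written boolean filters no longer exclude any data.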
