tustvold commented on issue #3463: URL: https://github.com/apache/datafusion/issues/3463#issuecomment-3708276942
> After each filter for each RecordBatch is evaluated we re-order them and possibly toss the ones with poor selectivity back into the scan phase.

I believe this is what https://github.com/apache/arrow-rs/pull/8733 does (or at least what I intended for it to do when I filed the original ticket): based on the selectivity of the predicate, it switches between using a RowSelection, which pushes the "filter" down to the actual parquet decode, and using a BooleanBuffer, which instead filters the array after decode.
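To make the trade-off concrete, here is a minimal, std-only sketch of that kind of switch. It does not use the arrow-rs API and is not the heuristic from the PR: `FilterStrategy`, `Run`, `to_runs`, `choose_strategy`, and the `max_selectivity` threshold are all hypothetical names, and the rule "few rows kept means skip during decode" is an assumption made purely for illustration.

```rust
/// One run of consecutive rows that are either all kept or all skipped.
/// Hypothetical type, loosely analogous to one entry of a row selection.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Run {
    select: bool,
    len: usize,
}

/// Hypothetical outcome of the adaptive choice.
#[derive(Debug, PartialEq)]
enum FilterStrategy {
    /// Push the filter into decode: skip unselected runs entirely.
    RowSelection(Vec<Run>),
    /// Decode every row, then filter the decoded array with a boolean mask.
    Mask(Vec<bool>),
}

/// Collapse a per-row boolean mask into alternating select/skip runs.
fn to_runs(mask: &[bool]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &bit in mask {
        if let Some(last) = runs.last_mut() {
            if last.select == bit {
                last.len += 1;
                continue;
            }
        }
        runs.push(Run { select: bit, len: 1 });
    }
    runs
}

/// Pick a strategy from the fraction of rows the predicate keeps.
/// Assumption: at or below `max_selectivity` we skip during decode,
/// above it we decode everything and mask afterwards.
fn choose_strategy(mask: Vec<bool>, max_selectivity: f64) -> FilterStrategy {
    if mask.is_empty() {
        return FilterStrategy::Mask(mask);
    }
    let kept = mask.iter().filter(|&&b| b).count();
    let selectivity = kept as f64 / mask.len() as f64;
    if selectivity <= max_selectivity {
        FilterStrategy::RowSelection(to_runs(&mask))
    } else {
        FilterStrategy::Mask(mask)
    }
}
```

A real implementation would probably weigh more than the kept fraction, for example how fragmented the runs are, since many short runs make skipping during decode expensive.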
