my-vegetable-has-exploded commented on issue #8685: URL: https://github.com/apache/arrow-datafusion/issues/8685#issuecomment-1884107159
Hi @domyway, you can now check whether the bloom filter is working by looking at the `row_groups_pruned_bloom_filter` metric. In my environment, the bloom filter works:

```
❯ CREATE EXTERNAL TABLE taxi STORED AS PARQUET LOCATION '/home/deepin/rust/arrow-datafusion/parquet-testing/data/data_index_bloom_encoding_stats.parquet';
0 rows in set. Query took 0.002 seconds.

❯ SET datafusion.execution.parquet.bloom_filter_enabled to true;
0 rows in set. Query took 0.001 seconds.

❯ EXPLAIN ANALYZE SELECT * FROM taxi WHERE (taxi."String" IN ('bb', 'bbc', 'bba', 'bbd', 'bbg', 'bbf', 'bbn', 'nnfa', 'bbnfd', 'bbx', 'bbxda', 'badfas', 'afd', 'adfas', 'adfa', 'asdfer', 'sefarj', 'erseioio', 'uioosdf', '0ba24'));
....row_groups_pruned_bloom_filter=1, .....
```

Thank you for finding it.
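If you want to check the metric from a script rather than by eye, here is a minimal sketch in Python. It assumes you have already captured the `EXPLAIN ANALYZE` output as text and that the metric appears as `row_groups_pruned_bloom_filter=<integer>`, as in the output above. The helper name `pruned_row_groups` is hypothetical and not part of DataFusion.

```python
import re
from typing import Optional

# Assumption: the metric is printed as "row_groups_pruned_bloom_filter=<integer>",
# matching the EXPLAIN ANALYZE output quoted in this comment.
_METRIC = re.compile(r"\brow_groups_pruned_bloom_filter=(\d+)")


def pruned_row_groups(explain_output: str) -> Optional[int]:
    """Sum every occurrence of the metric in the captured plan text.

    Returns None when the metric does not appear at all, which is
    different from the metric being present with a value of 0.
    """
    values = [int(v) for v in _METRIC.findall(explain_output)]
    return sum(values) if values else None


if __name__ == "__main__":
    sample = "....row_groups_pruned_bloom_filter=1, ....."
    print(pruned_row_groups(sample))
```

A result greater than zero suggests the bloom filter pruned at least one row group; `None` means the metric was not found in the text you passed in.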