Zoltán Borók-Nagy created IMPALA-9470:
-----------------------------------------
Summary: Use Parquet bloom filters
Key: IMPALA-9470
URL: https://issues.apache.org/jira/browse/IMPALA-9470
Project: IMPALA
Issue Type: New Feature
Reporter: Zoltán Borók-Nagy
PARQUET-41 has been closed recently. That means Parquet-MR is capable of
writing and reading bloom filters.
Currently, bloom filters are stored per column chunk, which means that with
their help we can filter out entire row groups.
We already filter row groups in HdfsParquetScanner::NextRowGroup() based on
column chunk statistics and dictionaries. Skipping row groups based on bloom
filters could also be added to this function.
Impala could also write bloom filters.
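
To illustrate the idea, here is a minimal C++ sketch of skipping row groups
for an equality predicate. It is only an assumption-laden illustration: the
names ToyBloomFilter and RowGroupsToScan are hypothetical and are not Impala
or Parquet APIs, and the hashing below is a simple stand-in, not the
split-block, xxHash-based format that the Parquet specification defines. The
Insert() side corresponds to what a writer would do per column chunk, and
MightContain() to the check a scanner could make before reading a row group.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy bloom filter for illustration only. NOT the Parquet split-block format.
class ToyBloomFilter {
 public:
  explicit ToyBloomFilter(std::size_t num_bits) : bits_(num_bits, false) {
    assert(num_bits > 0);
  }

  // Writer side: record that `value` occurs in this column chunk.
  void Insert(std::int64_t value) {
    for (std::size_t i = 0; i < kNumHashes; ++i) bits_[Index(value, i)] = true;
  }

  // Reader side: false means `value` is definitely absent,
  // true means it may be present (false positives are possible).
  bool MightContain(std::int64_t value) const {
    for (std::size_t i = 0; i < kNumHashes; ++i) {
      if (!bits_[Index(value, i)]) return false;
    }
    return true;
  }

 private:
  static constexpr std::size_t kNumHashes = 3;

  // Derives the i-th bit position with a 64-bit mixing function (assumed
  // stand-in for the hash function a real implementation would use).
  std::size_t Index(std::int64_t value, std::size_t i) const {
    std::uint64_t h =
        static_cast<std::uint64_t>(value) + 0x9e3779b97f4a7c15ULL * (i + 1);
    h ^= h >> 30;
    h *= 0xbf58476d1ce4e5b9ULL;
    h ^= h >> 27;
    h *= 0x94d049bb133111ebULL;
    h ^= h >> 31;
    return static_cast<std::size_t>(h % bits_.size());
  }

  std::vector<bool> bits_;
};

// Given one filter per row group (for the predicate's column), returns the
// indices of the row groups that still have to be scanned for `col = value`.
std::vector<std::size_t> RowGroupsToScan(
    const std::vector<ToyBloomFilter>& filters, std::int64_t value) {
  std::vector<std::size_t> to_scan;
  for (std::size_t i = 0; i < filters.size(); ++i) {
    if (filters[i].MightContain(value)) to_scan.push_back(i);
  }
  return to_scan;
}
```

Because a bloom filter never produces false negatives, a row group is skipped
only when it certainly contains no matching value. A false positive merely
costs an unnecessary scan, so the check is safe to combine with the existing
statistics and dictionary filtering.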