[ 
https://issues.apache.org/jira/browse/IMPALA-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-9470:
--------------------------------------
    Description: 
PARQUET-41 has been closed recently. This means Parquet-MR is capable of 
writing and reading bloom filters.

Currently bloom filters are stored per column chunk, i.e. with their help we 
can skip entire row groups that cannot contain a given value.

We already filter row groups in HdfsParquetScanner::NextRowGroup() based on 
column chunk statistics and dictionaries. Skipping row groups based on bloom 
filters could also be added to this function.

Impala could also write bloom filters.
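
To illustrate the idea, here is a minimal sketch of bloom-filter-based row group skipping. It is not Impala's scanner code and not the Parquet bloom filter format: SimpleBloomFilter, RowGroup and RowGroupsToScan are hypothetical names, and the filter is a toy bit array hashed with std::hash, chosen only to keep the example self-contained. It assumes an equality predicate on a single string column.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy bloom filter for illustration only (assumption: NOT the Parquet
// on-disk bloom filter layout or hash function).
class SimpleBloomFilter {
 public:
  explicit SimpleBloomFilter(std::size_t num_bits, int num_hashes = 3)
      : bits_(num_bits, false), num_hashes_(num_hashes) {
    assert(num_bits > 0);
  }

  // Writer side: record a value of the column chunk.
  void Insert(const std::string& value) {
    for (int i = 0; i < num_hashes_; ++i) bits_[BitIndex(value, i)] = true;
  }

  // Reader side: false means "definitely absent",
  // true means "possibly present" (false positives can happen).
  bool MayContain(const std::string& value) const {
    for (int i = 0; i < num_hashes_; ++i) {
      if (!bits_[BitIndex(value, i)]) return false;
    }
    return true;
  }

 private:
  // Double hashing: derive the i-th bit position from two base hashes.
  std::size_t BitIndex(const std::string& value, int i) const {
    std::size_t h1 = std::hash<std::string>{}(value);
    std::size_t h2 = std::hash<std::string>{}(value + "#salt") | 1;
    return (h1 + static_cast<std::size_t>(i) * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  int num_hashes_;
};

// Hypothetical stand-in for a row group with one filtered column chunk.
struct RowGroup {
  int id;
  SimpleBloomFilter filter;
};

// Returns the ids of the row groups that still have to be scanned for
// a predicate of the form `col = value`; all others can be skipped.
std::vector<int> RowGroupsToScan(const std::vector<RowGroup>& row_groups,
                                 const std::string& value) {
  std::vector<int> result;
  for (const RowGroup& rg : row_groups) {
    if (rg.filter.MayContain(value)) result.push_back(rg.id);
  }
  return result;
}
```

A row group is skipped only when the filter reports the value as definitely absent, so a false positive costs an unnecessary scan but never drops matching rows. That is the same safety property the existing statistics and dictionary checks rely on.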

  was:
PARQUET-41 has been closed recently. That means Parquet-MR is capable of 
writing and reading bloom filters.

Currently bloom filters per column chunk entries, this means with their help we 
can filter out entire row groups.

We already filter row groups in HdfsParquetScanner::NextRowGroup() based on 
column chunk statistics and dictionaries. Skipping row groups based on bloom 
filters could also be added to this function.

Impala could also write bloom filters.


> Use Parquet bloom filters
> -------------------------
>
>                 Key: IMPALA-9470
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9470
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>
> PARQUET-41 has been closed recently. This means Parquet-MR is capable of 
> writing and reading bloom filters.
> Currently bloom filters are stored per column chunk, i.e. with their help we 
> can skip entire row groups that cannot contain a given value.
> We already filter row groups in HdfsParquetScanner::NextRowGroup() based on 
> column chunk statistics and dictionaries. Skipping row groups based on bloom 
> filters could also be added to this function.
> Impala could also write bloom filters.


