[ 
https://issues.apache.org/jira/browse/PARQUET-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001854#comment-15001854
 ] 

Ferdinand Xu edited comment on PARQUET-41 at 11/12/15 8:47 AM:
---------------------------------------------------------------

Hi [~rdblue], 
I have updated two related PRs (https://github.com/apache/parquet-format/pull/28 
and https://github.com/apache/parquet-mr/pull/215). Could you help me review 
them? Thank you.
In the latest patch, the bloom filter bitset is not stored at the page level, 
which significantly reduces the extra space required.



was (Author: ferd):
Hi [~rdblue], 
I have updated two related PRs (https://github.com/apache/parquet-format/pull/28 
and https://github.com/apache/parquet-mr/pull/215). Could you help me review 
them? Thank you.


> Add bloom filters to parquet statistics
> ---------------------------------------
>
>                 Key: PARQUET-41
>                 URL: https://issues.apache.org/jira/browse/PARQUET-41
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-format, parquet-mr
>            Reporter: Alex Levenson
>            Assignee: Ferdinand Xu
>              Labels: filter2
>
> For row groups with no dictionary, we could still produce a bloom filter. 
> This could be very useful in filtering entire row groups.
> Pull request:
> https://github.com/apache/parquet-mr/pull/215



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
