[jira] [Commented] (PARQUET-1815) Add union API to BloomFilter interface

Gabor Szadovszky (Jira) Wed, 18 Mar 2020 02:38:50 -0700


    [ https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061545#comment-17061545 ]


Gabor Szadovszky commented on PARQUET-1815:
-------------------------------------------

The filters currently implemented in parquet-mr (e.g. dictionary filter, column 
indexes) are created for internal use. This means that the user does not have 
to care about them: the user simply sets the filter and gets the required 
values without knowing which filter implementation is dropping the unneeded 
values.
What is not clear to me in this jira is how the user would benefit from the 
union of the bloom filters.

> Add union API to BloomFilter interface
> --------------------------------------
>
>                 Key: PARQUET-1815
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1815
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: Junjie Chen
>            Priority: Minor
>              Labels: pull-request-available
>
> Sometimes, one may want to build a file-level bloom filter by taking the 
> union of all row-group bloom filters in order to save some memory. Add a 
> union API to make this easy.
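
For readers unfamiliar with the operation under discussion, the union of two
bloom filters can be sketched as a bitwise OR of their bitsets. The class
below is a hypothetical, simplified filter written only for illustration; it
is not the parquet-mr BloomFilter interface, and the names SimpleBloomFilter,
NUM_PROBES and union are assumptions, as is the probing scheme.

```java
/** Hypothetical, simplified bloom filter; NOT the parquet-mr implementation. */
final class SimpleBloomFilter {
    // Assumed number of bits set per inserted hash.
    private static final int NUM_PROBES = 4;

    private final long[] bits;
    private final int numBits;

    SimpleBloomFilter(int numBits) {
        if (numBits <= 0 || numBits % 64 != 0) {
            throw new IllegalArgumentException("numBits must be a positive multiple of 64");
        }
        this.numBits = numBits;
        this.bits = new long[numBits / 64];
    }

    // Derive the i-th bit position from the two halves of a 64-bit hash
    // (double hashing); the lower bit of the step is forced to 1 so it is odd.
    private int index(long hash, int i) {
        int h1 = (int) hash;
        int h2 = (int) (hash >>> 32) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void insertHash(long hash) {
        for (int i = 0; i < NUM_PROBES; i++) {
            int idx = index(hash, i);
            bits[idx >>> 6] |= 1L << (idx & 63);
        }
    }

    boolean findHash(long hash) {
        for (int i = 0; i < NUM_PROBES; i++) {
            int idx = index(hash, i);
            if ((bits[idx >>> 6] & (1L << (idx & 63))) == 0) {
                return false; // definitely not present
            }
        }
        return true; // possibly present
    }

    /** Merge another filter into this one by OR-ing the bitsets word by word. */
    void union(SimpleBloomFilter other) {
        if (other.numBits != numBits) {
            throw new IllegalArgumentException("filters must have the same size");
        }
        for (int i = 0; i < bits.length; i++) {
            bits[i] |= other.bits[i];
        }
    }
}
```

A union of this kind is only meaningful when both filters have the same size
and use the same hashing scheme. The merged filter reports every value that
either input would report, at the cost of a higher false-positive rate, since
more bits are set.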



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
