[
https://issues.apache.org/jira/browse/PARQUET-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061545#comment-17061545
]
Gabor Szadovszky commented on PARQUET-1815:
-------------------------------------------
The filters currently implemented in parquet-mr (e.g. the dictionary filter and
column indexes) were created for internal use. This means the user does not have
to care about them: they simply set a filter and get the required values without
knowing which filter implementation drops the unneeded values.
What is not clear to me in this Jira is how the user would benefit from
the union of the bloom filters.
> Add union API to BloomFilter interface
> --------------------------------------
>
> Key: PARQUET-1815
> URL: https://issues.apache.org/jira/browse/PARQUET-1815
> Project: Parquet
> Issue Type: Improvement
> Reporter: Junjie Chen
> Priority: Minor
> Labels: pull-request-available
>
> Sometimes one may want to build a file-level bloom filter by taking the union
> of all row-group bloom filters in order to save some memory. Add a union API
> to make this easy.
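
For readers unfamiliar with the operation under discussion, here is a minimal
sketch of what a bloom filter union amounts to: a bitwise OR of two bitsets that
were built with the same size and the same hash functions. SimpleBloomFilter and
all of its methods are hypothetical names made up for this illustration; this is
not the parquet-mr BloomFilter interface, and the hashing scheme (double hashing
over a SplitMix64-style mixer) is an assumption, not what Parquet uses.

```java
import java.util.BitSet;

// Hypothetical, simplified bloom filter for illustration only.
// It is NOT the parquet-mr BloomFilter interface.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // SplitMix64-style bit mixer (an assumed choice of hash).
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Double hashing: the i-th probe position is (h1 + i * h2) mod numBits.
    private int index(long value, int i) {
        long h1 = mix(value);
        long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L;
        return (int) Long.remainderUnsigned(h1 + i * h2, numBits);
    }

    void insert(long value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    // May return a false positive, never a false negative.
    boolean mightContain(long value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // Union in place: afterwards this filter reports every value that was
    // inserted into either filter. Both filters must share size and hashes.
    void union(SimpleBloomFilter other) {
        if (other.numBits != numBits || other.numHashes != numHashes) {
            throw new IllegalArgumentException("filters must have identical parameters");
        }
        bits.or(other.bits);
    }
}
```

Under these assumptions, a file-level filter would be obtained by starting from
an empty filter and calling union once per row-group filter. The result occupies
the space of a single filter, but because more bits are set its false-positive
rate is at least as high as that of any individual row-group filter, and the
approach only works if every row group uses the same filter size.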
--
This message was sent by Atlassian Jira
(v8.3.4#803005)