[jira] [Commented] (HIVE-24831) Support writing bloom filters in Parquet

Gabor Szadovszky (Jira) Thu, 25 Feb 2021 08:26:07 -0800


    [ 
https://issues.apache.org/jira/browse/HIVE-24831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291028#comment-17291028
 ]


Gabor Szadovszky commented on HIVE-24831:
-----------------------------------------

See https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/README.md 
for details of the required configuration. Search for {{bloom.filter}}.
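
As a rough illustration of that configuration, the sketch below expands a
comma-separated column list into per-column writer settings. It assumes the
key names {{parquet.bloom.filter.enabled}} and
{{parquet.bloom.filter.expected.ndv}} and the {{#column.path}} suffix
convention as described in the README; please verify them there. The helper
{{toParquetConf}} and the class name are hypothetical and are not part of
Hive or parquet-mr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BloomFilterConf {
    // Key names assumed from the parquet-hadoop README; check them against the README.
    static final String ENABLED = "parquet.bloom.filter.enabled";
    static final String EXPECTED_NDV = "parquet.bloom.filter.expected.ndv";

    /**
     * Hypothetical helper: expands a comma-separated column list (for example,
     * the value of a table property) into per-column keys of the form
     * "key#column.path".
     */
    static Map<String, String> toParquetConf(String columns, long expectedNdv) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String raw : columns.split(",")) {
            String col = raw.trim();
            if (col.isEmpty()) {
                continue; // skip blank entries such as "a,,b"
            }
            conf.put(ENABLED + "#" + col, "true");
            conf.put(EXPECTED_NDV + "#" + col, Long.toString(expectedNdv));
        }
        return conf;
    }

    public static void main(String[] args) {
        // Print the settings that would be generated for two example columns.
        toParquetConf("id, name", 1_000_000L)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting entries could then be copied into the job configuration used
by the Parquet writer; how Hive would pass them through is exactly what this
issue still has to decide.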

> Support writing bloom filters in Parquet
> ----------------------------------------
>
>                 Key: HIVE-24831
>                 URL: https://issues.apache.org/jira/browse/HIVE-24831
>             Project: Hive
>          Issue Type: New Feature
>          Components: Parquet
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> Parquet-mr 1.12.0 will add support for reading and writing Bloom filters.
> Reading doesn't need any action from the Hive side, as the filter will be 
> applied automatically if there is an Eq predicate on a column and the file 
> contains a bloom filter for that column.
> Writing needs some configuration, as Parquet-mr doesn't write bloom filters 
> by default.
> Table properties similar to those in ORC (e.g. 'orc.bloom.filter.columns') 
> could be used to set the columns for which Parquet-mr should write bloom 
> filters. The same table property could be used by both Hive and Impala 
> for the same purpose.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
