[ https://issues.apache.org/jira/browse/PARQUET-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693000#comment-17693000 ]

ASF GitHub Bot commented on PARQUET-2251:
-----------------------------------------

yabola commented on PR #1033:
URL: https://github.com/apache/parquet-mr/pull/1033#issuecomment-1442792170

   @wgtmac @gerashegalov Please take a look, thank you.
   I will also update [PR #1023](https://github.com/apache/parquet-mr/pull/1023) to
skip the Bloom filter when all pages are dictionary-encoded, because many
affected Parquet files have already been generated.




> Avoid generating Bloomfilter when all pages of a column are encoded by 
> dictionary
> ---------------------------------------------------------------------------------
>
>                 Key: PARQUET-2251
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2251
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Mars
>            Priority: Major
>
> In Parquet page V1, a BloomFilter is still generated even if all pages of a 
> column are dictionary-encoded. This is unnecessary: generating the 
> BloomFilter costs time and the filter occupies storage.
> Parquet page V2 doesn't generate a BloomFilter if all pages of a column are 
> dictionary-encoded.
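
The rule described in the issue can be sketched as a small standalone check. This is only an illustration of the decision, not the parquet-mr implementation: `BloomFilterDecision`, `PageEncoding`, and `shouldWriteBloomFilter` are hypothetical names, and the two-value enum is a simplification of the real encodings. The reasoning assumed here is that when every data page is dictionary-encoded, the dictionary already lists every distinct value in the column chunk, so a Bloom filter adds no extra pruning information.

```java
import java.util.List;

// Hypothetical names for illustration only; this is not the parquet-mr API.
final class BloomFilterDecision {

    // Simplified stand-in for the encoding used by a data page.
    enum PageEncoding { DICTIONARY, PLAIN }

    // Returns true only if at least one page fell back from dictionary
    // encoding, i.e. the dictionary alone cannot answer membership checks.
    static boolean shouldWriteBloomFilter(List<PageEncoding> pageEncodings) {
        for (PageEncoding encoding : pageEncodings) {
            if (encoding != PageEncoding.DICTIONARY) {
                return true;
            }
        }
        // Every page is dictionary-encoded: the Bloom filter would be redundant.
        return false;
    }
}
```

With this sketch, a column whose pages are all `DICTIONARY` yields `false`, while a column with at least one `PLAIN` page yields `true`.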



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
