[
https://issues.apache.org/jira/browse/PARQUET-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mars updated PARQUET-2260:
--------------------------
Description:
If the `parquet.bloom.filter.max.bytes` configuration value is not a power of 2,
the generated bloom filter exceeds this value.
For example, if `parquet.bloom.filter.max.bytes` is set to 1024*1024+1 =
1048577, the generated bloom filter is 2097152 bytes.
The correct behavior is to use the largest power of two less than 1048577,
which is 1024*1024 = 1048576.
was:
If the `parquet.bloom.filter.max.bytes` configuration value is not a power of 2,
the generated bloom filter exceeds this value.
For example, if `parquet.bloom.filter.max.bytes` is set to 1024*1024+1 =
1048577, the generated bloom filter is 2097152.
The correct behavior is to use the largest power of two less than 1048577,
which is 1024*1024 = 1048576.
> Bloom filter byte size shouldn't be larger than the maxBytes size in the
> configuration
> -----------------------------------------------------------------------------------
>
> Key: PARQUET-2260
> URL: https://issues.apache.org/jira/browse/PARQUET-2260
> Project: Parquet
> Issue Type: Bug
> Reporter: Mars
> Assignee: Mars
> Priority: Major
>
> If the `parquet.bloom.filter.max.bytes` configuration value is not a power of 2,
> the generated bloom filter exceeds this value.
> For example, if `parquet.bloom.filter.max.bytes` is set to 1024*1024+1 =
> 1048577, the generated bloom filter is 2097152 bytes.
> The correct behavior is to use the largest power of two less than 1048577,
> which is 1024*1024 = 1048576.
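
The arithmetic in the description can be sketched as follows. This is a minimal illustration, not Parquet's actual implementation: the class `BloomFilterSizing` and both helper methods are hypothetical names. It assumes the reported size of 2097152 comes from rounding the configured limit up to the next power of two, which is consistent with the numbers above, and contrasts that with rounding down.

```java
// Hypothetical helper class for illustration only; not part of parquet-mr.
final class BloomFilterSizing {
    private BloomFilterSizing() {}

    // Rounds n up to the nearest power of two (assumed to match the reported behavior).
    static int roundUpToPowerOfTwo(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        int highest = Integer.highestOneBit(n);
        if (highest == n) {
            return n; // already a power of two
        }
        if (highest == (1 << 30)) {
            // The next power of two would not fit in a signed 32-bit int.
            throw new ArithmeticException("no larger power of two fits in an int: " + n);
        }
        return highest << 1;
    }

    // Rounds n down to the nearest power of two (the behavior the issue asks for).
    static int roundDownToPowerOfTwo(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive: " + n);
        }
        return Integer.highestOneBit(n);
    }

    public static void main(String[] args) {
        int maxBytes = 1024 * 1024 + 1; // 1048577, the value from the description
        System.out.println(roundUpToPowerOfTwo(maxBytes));   // 2097152, exceeds maxBytes
        System.out.println(roundDownToPowerOfTwo(maxBytes)); // 1048576, within maxBytes
    }
}
```

Rounding down guarantees the result never exceeds the configured limit, at the cost of using up to half of the allowed bytes when the limit sits just below a power of two.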