Thanks for the additional context, but I don't quite get why a utility
class like this would need to make a call on what the maximum size of a
bloom filter should be in the format. That's really a write-side concern.
Can we just remove that code from the current PR and discuss it when we are
working on how to produce appropriately-configured bloom filters?

On Tue, Jun 26, 2018 at 4:09 PM 俊杰陈 <cjjnj...@gmail.com> wrote:

> Hi Ryan,
>
> The last comment on the doc asked for a benchmark of dictionary vs. Bloom
> filter. I have posted the benchmark results here
> <https://docs.google.com/spreadsheets/d/1yV3u-P_yY4DtfSty3LPrbhwuJx4cqm_YeK61s2v0OLU/edit?usp=sharing>.
> Jim has reviewed them and updated the comments on JIRA as well. You can check
> JIRA <https://issues.apache.org/jira/browse/PARQUET-41> for the latest
> status.
>
> We created some subtasks for PARQUET-41, and the first step [PARQUET-1332
> <https://issues.apache.org/jira/browse/PARQUET-1332>] is to implement the
> Bloom filter utility class itself in parquet-mr and parquet-cpp. The
> question above is related to it.
>
>
>
> Ryan Blue <rb...@netflix.com.invalid> wrote on Wed, Jun 27, 2018 at 12:35 AM:
>
>> I thought the plan was to finish the bloom filter spec and then decide how
>> to create appropriately sized filters. This sounds like a write-side
>> implementation detail to me. What is the current plan for getting this
>> work in?
>>
>> On Mon, Jun 25, 2018 at 8:43 PM 俊杰陈 <cjjnj...@gmail.com> wrote:
>>
>> > Hi devs
>> >
>> > I'm now implementing the bloom filter feature and need to set a default
>> > maximum value for the bloom filter size of a block. According to the
>> > calculation here
>> > <https://docs.google.com/spreadsheets/d/1LQqGZ1EQSkPBXtdi9nyANiQOhwNFwqiiFe8Sazclf5Y/edit#gid=0>,
>> > I plan to set the maximum size to 1/8 of parquet.block.size, which can
>> > achieve about 0.25 FPP when a block has only one column of long type and
>> > all values are different. What do you think about this? Any feedback is
>> > welcome.
>> >
>> > --
>> > Thanks & Best Regards
>> >
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
>
> --
> Thanks & Best Regards
>


-- 
Ryan Blue
Software Engineer
Netflix
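
For readers following the sizing question in the quoted proposal, here is a
minimal sketch of the arithmetic, not code from the PR. It assumes a
parquet.block.size of 128 MiB, 8-byte long values that are all distinct, and
the textbook Bloom filter false-positive estimate (1 - e^(-k/b))^k. The
function names are hypothetical, and the linked spreadsheet may use a
different formula, so the result need not match the FPP figure quoted in the
thread.

```python
import math


def bits_per_value(block_size_bytes, value_width_bytes, filter_fraction):
    """Filter bits available per value when a block holds a single column of
    fixed-width values and every value is different."""
    num_values = block_size_bytes // value_width_bytes
    filter_bits = block_size_bytes * filter_fraction * 8
    return filter_bits / num_values


def classic_fpp(bits_per_val, num_hashes=None):
    """Textbook Bloom filter estimate: (1 - e^(-k/b))^k.

    If num_hashes is omitted, k is chosen near the optimum b * ln(2).
    This is an assumption; it is not necessarily the spreadsheet's formula.
    """
    if num_hashes is None:
        num_hashes = max(1, round(bits_per_val * math.log(2)))
    return (1.0 - math.exp(-num_hashes / bits_per_val)) ** num_hashes


block_size = 128 * 1024 * 1024  # assumed parquet.block.size (128 MiB)
b = bits_per_value(block_size, value_width_bytes=8, filter_fraction=1 / 8)
print(f"bits per distinct value: {b}")
print(f"estimated FPP (classic formula): {classic_fpp(b):.4f}")
```

Note that the bits-per-value figure does not depend on the block size in this
scenario: with one column of 8-byte values, a filter capped at 1/8 of the
block always gets the same number of bits per distinct value.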
