[
https://issues.apache.org/jira/browse/IMPALA-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17234065#comment-17234065
]
Riza Suminto commented on IMPALA-6311:
--------------------------------------
I think reducing the default target FPP to 10% is still reasonable.
From my observation, setting 10% FPP will increase the target Bloom filter
size to 2x compared to 75% FPP.
TPC-DS runs against a 30TB TPC-DS cluster with 10% FPP and
RUNTIME_FILTER_MAX_SIZE up to 8MB run well, without significant regression.
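
As a rough sanity check on the 2x figure, here is a minimal sketch that sizes
a Bloom filter from an NDV estimate and a target FPP. It assumes the textbook
approximation fpp = (1 - exp(-k * ndv / m)) ** k with a fixed k = 8 hash
functions, and a size rounded up to a power-of-two number of bytes. Both of
those are assumptions made for illustration, not details stated in this
thread, and the names bloom_bits and pow2_bytes are hypothetical.

```python
import math


def bloom_bits(ndv: int, fpp: float, k: int = 8) -> float:
    """Bits needed for a k-hash Bloom filter holding ndv keys at the given FPP.

    Inverts the approximation fpp = (1 - exp(-k * ndv / m)) ** k for m.
    k = 8 is an assumed, fixed number of hash functions.
    """
    return -k * ndv / math.log(1.0 - fpp ** (1.0 / k))


def pow2_bytes(bits: float) -> int:
    """Round a size in bits up to a power-of-two number of bytes.

    The power-of-two rounding is an assumed sizing policy.
    """
    return 1 << max(0, math.ceil(math.log2(bits / 8)))


ndv = 1_000_000  # hypothetical NDV estimate

# Ratio of the unrounded sizes; it does not depend on ndv.
raw_ratio = bloom_bits(ndv, 0.10) / bloom_bits(ndv, 0.75)

# Sizes after rounding up to a power of two, in bytes.
size_75 = pow2_bytes(bloom_bits(ndv, 0.75))
size_10 = pow2_bytes(bloom_bits(ndv, 0.10))

print(f"unrounded ratio: {raw_ratio:.2f}")
print(f"75% FPP: {size_75} bytes, 10% FPP: {size_10} bytes")
```

Under these assumptions the unrounded ratio is about 2.4, so after
power-of-two rounding the 10% FPP filter comes out 2x or 4x larger depending
on the NDV. With the classic formula that also picks the optimal number of
hash functions, the ratio would be much larger, so the fixed k matters here.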
> Evaluate smaller FPP for Bloom filters
> --------------------------------------
>
> Key: IMPALA-6311
> URL: https://issues.apache.org/jira/browse/IMPALA-6311
> Project: IMPALA
> Issue Type: Task
> Components: Perf Investigation
> Reporter: Jim Apple
> Assignee: Riza Suminto
> Priority: Major
>
> The Bloom filters are created by estimating the NDV and then using the FPP of
> 75% to get the right size for the filter. This may be too high to be very
> useful - if our filters are currently filtering more than 75% out, then it is
> only because we are overestimating NDV.