Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19506 )
Change subject: IMPALA-11924: Cap runtime filter NDV with build key NDV
......................................................................

Patch Set 1:

(1 comment)

Hi Csaba, I have a question.

http://gerrit.cloudera.org:8080/#/c/19506/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/19506/1/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@665
PS1, Line 665: buildKeyNdv_ = srcExpr_.getNumDistinctValues();
My understanding of Bloom filters is that a slightly oversized filter is better than an undersized one, since a filter that is too small leads to a higher false-positive rate. In the JIRA, you give an example of an 8x reduction in size, from 64KB to 8KB. Is the NDV information accurate enough to warrant such a reduction? Would it be OK to consider NDV only when the filter size would otherwise exceed a certain threshold (say, 512KB)?

--
To view, visit http://gerrit.cloudera.org:8080/19506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa46789663cb2e6d29f518757d89c85ff8e4d1a
Gerrit-Change-Number: 19506
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 16 Feb 2023 17:22:52 +0000
Gerrit-HasComments: Yes
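
The sizing concern raised in the comment above can be illustrated with a small sketch. It uses the textbook Bloom filter false-positive formula, which may differ from what Impala's actual filter implementation achieves. Every name and constant in it (`BloomSizingSketch`, `NUM_HASHES`, the 4KB to 16MB size range, the 1% target rate, `sizeWithThresholdedCap`) is hypothetical and chosen for illustration; none of it is taken from the patch.

```java
public class BloomSizingSketch {
    // Illustrative constants only; these are assumptions, not Impala's settings.
    static final int NUM_HASHES = 8;
    static final long MIN_BYTES = 4L * 1024;
    static final long MAX_BYTES = 16L * 1024 * 1024;

    // Textbook estimate for a filter of m bits holding n distinct keys with
    // k hash functions: p = (1 - e^(-k*n/m))^k.
    static double falsePositiveProb(long ndv, long sizeBytes) {
        double bits = sizeBytes * 8.0;
        double fillFraction = 1.0 - Math.exp(-NUM_HASHES * (double) ndv / bits);
        return Math.pow(fillFraction, NUM_HASHES);
    }

    // Smallest power-of-two size in [MIN_BYTES, MAX_BYTES] whose estimated
    // false-positive probability is at or below targetFpp.
    static long minSizeBytes(long ndv, double targetFpp) {
        long size = MIN_BYTES;
        while (size < MAX_BYTES && falsePositiveProb(ndv, size) > targetFpp) {
            size *= 2;
        }
        return size;
    }

    // Hypothetical form of the reviewer's suggestion: ignore the build key NDV
    // unless sizing by cardinality alone would exceed thresholdBytes.
    static long sizeWithThresholdedCap(long buildCardinality, long buildKeyNdv,
                                       double targetFpp, long thresholdBytes) {
        long uncapped = minSizeBytes(buildCardinality, targetFpp);
        if (uncapped <= thresholdBytes) {
            return uncapped;
        }
        return minSizeBytes(Math.min(buildCardinality, buildKeyNdv), targetFpp);
    }

    public static void main(String[] args) {
        // Size the filter for an estimated 10,000 keys, then see what happens
        // if the real number of distinct keys is 4x the estimate.
        long size = minSizeBytes(10_000, 0.01);
        System.out.printf("size=%d bytes%n", size);
        System.out.printf("fpp at estimated NDV: %.4f%n", falsePositiveProb(10_000, size));
        System.out.printf("fpp at 4x NDV:        %.4f%n", falsePositiveProb(40_000, size));
        System.out.printf("thresholded size:     %d bytes%n",
                sizeWithThresholdedCap(1_000_000, 1_000, 0.01, 512L * 1024));
    }
}
```

Under these assumptions, an NDV underestimate shrinks the filter and raises the false-positive rate, which is the risk the question is about. The thresholded variant trades that risk away for small filters, at the cost of keeping them larger than the NDV alone would require.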
