Bankim Bhavsar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16036 )

Change subject: WIP [perf] KUDU-3140 Heuristics to disable predicate evaluation for Bloom filter
......................................................................

WIP [perf] KUDU-3140 Heuristics to disable predicate evaluation for Bloom filter

Column predicate evaluation can be expensive, and ineffective column
predicates waste CPU. TPCH Q9 exhibits a significant regression of 50-96%
when Bloom filter predicates are enabled. See KUDU-3140 for details.

This change adds a simple heuristic taken from the HDFS scanner in Impala:
every 16 blocks, it checks whether a predicate has rejected less than 10%
of the rows scanned and, if so, disables the predicate.

Stats collection is enabled by default for all predicate types, but
enforcement is enabled only for the Bloom filter predicate type.

Bloom filter predicates are expected to produce false positives, so the
client is expected to do further filtering to remove them. In this change,
Kudu decides to disable the predicate independently and does not inform
the client, which is acceptable for Bloom filter predicates given the
rationale above.

TODO:
- Add/update tests
- Determine any missing iterators or code paths
- Update Bloom filter predicate client docs
- Determine whether these heuristics slow down other predicates

Change-Id: I10197800a01a1b34c7821ac879caf8d272cab8dd
---
M src/kudu/common/generic_iterators.cc
1 file changed, 189 insertions(+), 9 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/16036/1
--
To view, visit http://gerrit.cloudera.org:8080/16036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I10197800a01a1b34c7821ac879caf8d272cab8dd
Gerrit-Change-Number: 16036
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <[email protected]>
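
For readers who want a concrete picture of the heuristic described in the commit message, the following is a minimal sketch, not the code in this patch. The struct and member names (PredicateEffectivenessTracker, RecordBlock, enforce, and so on) are hypothetical and do not come from generic_iterators.cc. The sketch also assumes that the rejection counters accumulate over the lifetime of the scan rather than resetting at each check, which the commit message does not specify. Only the two constants (16 blocks, 10%) are taken from the description.

```cpp
#include <cstdint>

// Hypothetical sketch of the "disable ineffective predicate" heuristic.
// None of these names are taken from the actual Kudu patch.
struct PredicateEffectivenessTracker {
  // Values stated in the commit message.
  static constexpr int kBlocksPerCheck = 16;
  static constexpr int kMinRejectionPercent = 10;

  // Assumption: enforcement is switched on only for Bloom filter
  // predicates; other predicate types just collect stats.
  bool enforce = false;
  bool enabled = true;

  int blocks_since_check = 0;
  // Assumption: counters are cumulative across checks.
  int64_t rows_scanned = 0;
  int64_t rows_rejected = 0;

  // Call once per block after the predicate has been evaluated on it.
  void RecordBlock(int64_t scanned, int64_t rejected) {
    if (!enabled) return;  // Predicate already disabled; nothing to track.
    rows_scanned += scanned;
    rows_rejected += rejected;
    if (++blocks_since_check < kBlocksPerCheck) return;
    blocks_since_check = 0;
    // Integer form of: rows_rejected / rows_scanned < 10%.
    const bool ineffective =
        rows_rejected * 100 < rows_scanned * kMinRejectionPercent;
    if (enforce && ineffective) {
      enabled = false;
    }
  }
};
```

In this sketch, an iterator would check `enabled` before evaluating the predicate on the next block and skip evaluation once it is false. Because the client already has to filter out Bloom filter false positives, skipping the predicate only changes how many rows are returned, not the correctness of the final result.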
