Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3007: Adjust Bloom Filter size according to NDV estimate
......................................................................

Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/2812/4/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 522: ctx.filter->filter_desc().ndv_estimate);
> It's a bit more convenient to put it in the RuntimeFilter itself (that way
even better.


Line 524: bool fp_rate_too_high =
> Only in the parquet scanner. Not clear if there'd be an advantage to doing
you can do it on batch boundaries and leave out the branch in the fast path. i'd say leave a todo. disabling useless filters is definitely a perf gain.


http://gerrit.cloudera.org:8080/#/c/2812/4/testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test:

Line 6: # consumption / spilling behaviour.
> Not sure exactly where you mean: this is the only query this applies to. (T
can you indicate at the top why this is in a separate file?


-- 
To view, visit http://gerrit.cloudera.org:8080/2812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1fe37b8d4cfb3c52bb8e8cf0ca55e92665b87803
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-HasComments: Yes
