Henry Robinson has submitted this change and it was merged.

Change subject: IMPALA-3141: Send dummy filters when filter production is disabled
......................................................................

IMPALA-3141: Send dummy filters when filter production is disabled

The PHJ (partitioned hash join) may disable runtime filter production
for one of several reasons, including a predicted high false-positive
rate. If the filters are not produced, any scans will wait for their
entire timeout before continuing.

This patch changes the filter logic to always send a filter, even if
one wasn't actually produced by the PHJ. To preserve correctness, that
filter must contain every element of the set. Such a filter is
represented by (BloomFilter*)NULL. This allows us to make no changes to
RuntimeFilter::Eval(), which already returns true if the member Bloom
filter is NULL.

In RPCs, a new field is added to TBloomFilter to identify filters that
are always true.

The HdfsParquetScanner checks to see if filters would always return
true for any element, and disables them if so.

There is some miscellaneous cleanup in this patch, particularly the
removal of unused members in BloomFilter.

This patch has been manually tested on queries that would otherwise
take a long time to time out. A unit test was added to ensure that
queries do not wait.
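
The always-true convention described above can be illustrated with a
small, self-contained sketch. This is not the code from the patch: every
Toy* name, the ScanShouldApplyFilter() helper, and the exact-set stand-in
for a Bloom filter are hypothetical and exist only for illustration. The
only behaviour taken from the commit message is that a NULL Bloom filter
means "contains every element", that Eval() returns true in that case,
and that a scanner can skip a filter that would always return true.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for a Bloom filter. It stores exact hashes, so it
// has no false positives; a real Bloom filter would be probabilistic.
class ToyBloomFilter {
 public:
  void Insert(uint32_t hash) { hashes_.insert(hash); }
  bool Find(uint32_t hash) const { return hashes_.count(hash) > 0; }

 private:
  std::unordered_set<uint32_t> hashes_;
};

// Hypothetical runtime filter wrapper. A null bloom_filter_ represents the
// filter that contains every element, mirroring (BloomFilter*)NULL.
class ToyRuntimeFilter {
 public:
  // Passing nullptr installs the always-true (dummy) filter. Either way the
  // filter counts as having arrived, so a waiting scan can stop waiting.
  void SetBloomFilter(std::unique_ptr<ToyBloomFilter> bf) {
    bloom_filter_ = std::move(bf);
    arrived_ = true;
  }

  bool HasArrived() const { return arrived_; }

  // True only for a filter that arrived and carries no Bloom filter.
  bool AlwaysTrue() const { return arrived_ && bloom_filter_ == nullptr; }

  // Returns true when there is no Bloom filter, so no row is ever dropped
  // incorrectly by a dummy filter.
  bool Eval(uint32_t hash) const {
    if (bloom_filter_ == nullptr) return true;
    return bloom_filter_->Find(hash);
  }

 private:
  std::unique_ptr<ToyBloomFilter> bloom_filter_;
  bool arrived_ = false;
};

// Hypothetical scanner-side check: evaluating a filter that always returns
// true is wasted work, so only apply filters that can reject something.
inline bool ScanShouldApplyFilter(const ToyRuntimeFilter& filter) {
  return filter.HasArrived() && !filter.AlwaysTrue();
}
```

In this sketch the dummy filter and the "not yet arrived" state both make
Eval() return true; the separate arrived flag is what lets a scan tell
them apart and stop waiting as soon as the dummy filter is delivered.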

Change-Id: I04b3e6542651c1e7b77a9bab01d0e3d9506af42f
Reviewed-on: http://gerrit.cloudera.org:8080/2475
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <[email protected]>
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/coordinator.cc
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/util/bloom-filter-test.cc
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java
M fe/src/main/java/com/cloudera/impala/planner/PlanFragment.java
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test
22 files changed, 226 insertions(+), 246 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/2475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I04b3e6542651c1e7b77a9bab01d0e3d9506af42f
Gerrit-PatchSet: 14
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
