Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/15683 )
Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu ...................................................................... Patch Set 22: (23 comments) http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/codegen/gen_ir_descriptions.py File be/src/codegen/gen_ir_descriptions.py: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/codegen/gen_ir_descriptions.py@204 PS22, Line 204: ["RAW_VALUE_GET_HASH_VALUE", > I don't think this is being used anywhere anymore, so it can be removed. Right, will remove this entry. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/exec/kudu-scanner.cc File be/src/exec/kudu-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/exec/kudu-scanner.cc@247 PS22, Line 247: NULL > nit: we prefer nullptr, here and elsewhere Will fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/kudu/util/block_bloom_filter.h File be/src/kudu/util/block_bloom_filter.h: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/kudu/util/block_bloom_filter.h@146 PS22, Line 146: // It's possible that log_heap_size is less than kLogBucketWordBits in unit test. > We definitely want to avoid making any changes to the be/src/kudu directory Will change bloom-filter-test.cc to avoid hitting this error. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h@86 PS22, Line 86: GetHashValueFastHash32 > I think you mean 'GetHashValueFastHash', i.e. this function doesn't return Right, fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h@93 PS22, Line 93: GetHashValueFastHash32NonNull > Same here - this doesn't return a 32-bit value, so I think you meant to say will fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter-test.cc File be/src/util/bloom-filter-test.cc: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter-test.cc@34 PS22, Line 34: // kudu::BlockBloomFilter::kCpu is a static variable and is initialized once. : // To temporarily disable AVX2 for Bloom Filter in runtime testing, set flag : // disable_blockbloomfilter_avx2 as true. See kudu::BlockBloomFilter::has_avx2(). : // kudu::BlockBloomFilter::has_avx2() is called by kudu::BlockBloomFilter : // constructor and kudu::BlockBloomFilter::OrEqualArrayInternal(). Impala : // testing code should set this flag before creating impala::BloomFilter object : // which affect impala::BloomFilter::Insert() and impala::BloomFilter::Find(), : // or before calling impala::BloomFilter::or(). : // This flag has no effect if the target CPU doesn't support AVX2. > This comment is a lot longer than necessary. Probably fine to just say some will make it short. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@40 PS22, Line 40: class BlockBloomFilter; : class BlockBloomFilterBufferAllocatorIf; : class Slice; > nit: I think currently as written none of these are needed, since Slice isn will remove these three lines. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@70 PS22, Line 70: class ImpalaBloomFilterBufferAllocator : public kudu::BlockBloomFilterBufferAllocatorIf { > I think this is now only referenced in bloom-filter.cc, so we can just move A buffer allocator is defined as regular member variable in BloomFilter class. If moving this class definition to bloom-filter.cc, we will get a compilation error which complains incomplete type for the allocator member variable. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@210 PS22, Line 210: * > nit: formatting will fix it http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@213 PS22, Line 213: /// Buffer allocator is owned by the bloom filter. > This comment seems weird. Maybe say something like "used by kudu to allocat will fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@215 PS22, Line 215: e > nit: 'E' Will fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc File be/src/util/bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@35 PS22, Line 35: #include "util/kudu-status-util.h" > nit: please alphabetize Will fix it http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@49 PS22, Line 49: delete > In general, we try to avoid using 'delete' as its error prone. will change it as regular member variable. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@52 PS22, Line 52: // Kudu only support FastHash for BlockBloomFilter. : // Since Fasthash is strictly better than Murmur Hash2, we do not : // support Murmur Hash2 algorithm for Bloom filter. > How about we move this information to the class header on BloomFilter in bl will move to header file. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@57 PS22, Line 57: hash_seed_ > I don't think this is ever actually used, so it can probably just be remove will remove it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@176 PS22, Line 176: > nit: formating will fix it. http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/hash-util.h@30 PS22, Line 30: enum HashAlgorithm { > I don't think this is used anywhere, so lets remove it. will remote it. http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@722 PS22, Line 722: SlotRef slotRef = targetExpr.unwrapSlotRef(true); > This is unnecessary - what its doing is removing any possible casts around will fix it as suggested. http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@596 PS22, Line 596: public void testKudu() { > I think it would be good to add some planner tests for the interaction betw Will add a new test file bloom-filter-assignment.test for the new test cases. http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@635 PS22, Line 635: public void testKuduTpch() { > Might be nice to set explainLevel = EXTENDED on at least one of these tests will add query option with explain_level as extended for a query with both bloom filter and min-max filter. http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test File testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test: http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@10 PS22, Line 10: 29200 > Is it possible to add PROFILE sections for these first test cases like the Will add profile sections with aggregation over ProbeRows and row_regex for the "n of n Runtime Filters Published". http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@396 PS22, Line 396: # Test case 4: Two StringMinMaxFilters generated in the same fragment (IMPALA-7272) > I think this test case can probably be removed (the memory allocation in Mi will remove it. http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@423 PS22, Line 423: # Test case 5: SEMI_JOIN > Add a comment saying what's interesting about this test case? It's suggested by Aman to add this test case. Will add comment. -- To view, visit http://gerrit.cloudera.org:8080/15683 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754 Gerrit-Change-Number: 15683 Gerrit-PatchSet: 22 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jim Apple <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Wed, 03 Jun 2020 13:11:51 +0000 Gerrit-HasComments: Yes
