Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15683 )

Change subject: IMPALA-3741 [part 2]: Push runtime bloom filter to Kudu
......................................................................


Patch Set 22:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/codegen/gen_ir_descriptions.py@204
PS22, Line 204: ["RAW_VALUE_GET_HASH_VALUE",
> I don't think this is being used anywhere anymore, so it can be removed.
Right, will remove this entry.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/exec/kudu-scanner.cc@247
PS22, Line 247: NULL
> nit: we prefer nullptr, here and elsewhere
Will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/kudu/util/block_bloom_filter.h
File be/src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/kudu/util/block_bloom_filter.h@146
PS22, Line 146:     // It's possible that log_heap_size is less than 
kLogBucketWordBits in unit test.
> We definitely want to avoid making any changes to the be/src/kudu directory
Will change bloom-filter-test.cc to avoid hitting this error.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h@86
PS22, Line 86: GetHashValueFastHash32
> I think you mean 'GetHashValueFastHash', i.e. this function doesn't return
Right, fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/runtime/raw-value.h@93
PS22, Line 93: GetHashValueFastHash32NonNull
> Same here - this doesn't return a 32-bit value, so I think you meant to say
will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter-test.cc
File be/src/util/bloom-filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter-test.cc@34
PS22, Line 34: // kudu::BlockBloomFilter::kCpu is a static variable and is 
initialized once.
             : // To temporarily disable AVX2 for Bloom Filter in runtime 
testing, set flag
             : // disable_blockbloomfilter_avx2 as true. See 
kudu::BlockBloomFilter::has_avx2().
             : // kudu::BlockBloomFilter::has_avx2() is called by 
kudu::BlockBloomFilter
             : // constructor and 
kudu::BlockBloomFilter::OrEqualArrayInternal(). Impala
             : // testing code should set this flag before creating 
impala::BloomFilter object
             : // which affect impala::BloomFilter::Insert() and 
impala::BloomFilter::Find(),
             : // or before calling impala::BloomFilter::or().
             : // This flag has no effect if the target CPU doesn't support 
AVX2.
> This comment is a lot longer than necessary. Probably fine to just say some
will make it short.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@40
PS22, Line 40: class BlockBloomFilter;
             : class BlockBloomFilterBufferAllocatorIf;
             : class Slice;
> nit: I think currently as written none of these are needed, since Slice isn
will remove these three lines.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@70
PS22, Line 70: class ImpalaBloomFilterBufferAllocator : public 
kudu::BlockBloomFilterBufferAllocatorIf {
> I think this is now only referenced in bloom-filter.cc, so we can just move
A buffer allocator is defined as regular member variable in BloomFilter class. 
If moving this class definition to bloom-filter.cc, we will get a compilation 
error which complains  incomplete type for the allocator member variable.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@210
PS22, Line 210:  *
> nit: formatting
will fix it


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@213
PS22, Line 213: /// Buffer allocator is owned by the bloom filter.
> This comment seems weird. Maybe say something like "used by kudu to allocat
will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.h@215
PS22, Line 215: e
> nit: 'E'
Will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@35
PS22, Line 35: #include "util/kudu-status-util.h"
> nit: please alphabetize
Will fix it


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@49
PS22, Line 49: delete
> In general, we try to avoid using 'delete' as its error prone.
will change it as regular member variable.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@52
PS22, Line 52: // Kudu only support FastHash for BlockBloomFilter.
             : // Since Fasthash is strictly better than Murmur Hash2, we do not
             : // support Murmur Hash2 algorithm for Bloom filter.
> How about we move this information to the class header on BloomFilter in bl
will move to header file.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@57
PS22, Line 57: hash_seed_
> I don't think this is ever actually used, so it can probably just be remove
will remove it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/bloom-filter.cc@176
PS22, Line 176:
> nit: formating
will fix it.


http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/15683/22/be/src/util/hash-util.h@30
PS22, Line 30: enum HashAlgorithm {
> I don't think this is used anywhere, so lets remove it.
will remote it.


http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@722
PS22, Line 722: SlotRef slotRef = targetExpr.unwrapSlotRef(true);
> This is unnecessary - what its doing is removing any possible casts around
will fix it as suggested.


http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@596
PS22, Line 596:   public void testKudu() {
> I think it would be good to add some planner tests for the interaction betw
Will add a new test file bloom-filter-assignment.test for the new test cases.


http://gerrit.cloudera.org:8080/#/c/15683/22/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@635
PS22, Line 635:   public void testKuduTpch() {
> Might be nice to set explainLevel = EXTENDED on at least one of these tests
will add query option with explain_level as extended for a query with both 
bloom filter and min-max filter.


http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test
File 
testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test:

http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@10
PS22, Line 10: 29200
> Is it possible to add PROFILE sections for these first test cases like the
Will add profile sections with aggregation over ProbeRows and row_regex for the 
"n of n Runtime Filters Published".


http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@396
PS22, Line 396: # Test case 4: Two StringMinMaxFilters generated in the same 
fragment (IMPALA-7272)
> I think this test case can probably be removed (the memory allocation in Mi
will remove it.


http://gerrit.cloudera.org:8080/#/c/15683/22/testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test@423
PS22, Line 423: # Test case 5: SEMI_JOIN
> Add a comment saying what's interesting about this test case?
It's suggested by Aman to add this test case. Will add comment.



--
To view, visit http://gerrit.cloudera.org:8080/15683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9100076f68ea299ddb6ec8bc027cac7a47f5d754
Gerrit-Change-Number: 15683
Gerrit-PatchSet: 22
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Wed, 03 Jun 2020 13:11:51 +0000
Gerrit-HasComments: Yes

Reply via email to