Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16036 )

Change subject: WIP [perf] KUDU-3140 Heuristics to disable predicate evaluation 
for Bloom filter
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/generic_iterators.cc@1290
PS2, Line 1290: num_rows_before
Shouldn't we be measuring this before the call to MaterializeColumn(), which 
might evaluate some predicates? Otherwise, wouldn't dictionary/RLE encoded 
cells not be accounted for, even though they still consume CPU to evaluate the 
BF predicate?


http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/predicate_effectiveness.h
File src/kudu/common/predicate_effectiveness.h:

http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/predicate_effectiveness.h@30
PS2, Line 30: // Per column predicate effectiveness context.
            : // Though all column predicate types are tracked, effectiveness 
of only Bloom filter predicate
            : // is currently enforced.
I don't agree with making this generic for all predicates. It seems like 
effectiveness/disabling of a predicate only makes sense in the context of a 
bloom filter because applications that use bloom filters typically accept false 
positives. That isn't the case for any other kind of predicate, where exact 
evaluation of the predicate is required. That makes me think this should be a 
bit more specialized.

Did you consider making the IteratorPredicateEffectivenessContext only track 
"disableable" predicates like bloom filters? eg by storing a {predicate_idx => 
predicate context} map instead of a vector, or storing a vector of 
is_disableable bools?


http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/predicate_effectiveness.h@38
PS2, Line 38:  : type(type), rows_read(0),
            :                                                                
rows_rejected(0), enabled(true) {
micro-nit: would prefer putting these on a new line (at least I tend to favor 
narrower code where convenient).


http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/predicate_effectiveness.h@83
PS2, Line 83: CheckEffectiveness
nit: it isn't really clear from this naming that this also disables the 
predicates. Maybe rename to DisableIneffectivePredicates() or somesuch?

Also, why bother returning the disabled predicates? Seems like they're just 
used for logging -- can we push the logging into this class, eg by storing a 
pointer to the predicate in the effectiveness context instead of just the type?


http://gerrit.cloudera.org:8080/#/c/16036/2/src/kudu/common/predicate_effectiveness.h@88
PS2, Line 88: co-ordinate
nit: coordinate



--
To view, visit http://gerrit.cloudera.org:8080/16036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10197800a01a1b34c7821ac879caf8d272cab8dd
Gerrit-Change-Number: 16036
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 10 Jun 2020 02:12:41 +0000
Gerrit-HasComments: Yes

Reply via email to