Dan Burkert has posted comments on this change.

Change subject: Predicate evaluation pushdown

Patch Set 2:


Overall I had a few style nits that were repeated throughout the patch:

* Put the pointer * and reference & next to the type, not next to the variable. 
 We have been consistent about this recently in Kudu, but since the column 
decoding is some of the oldest code we have some mixed styles.  Best to stick 
to next to the type going forward.

* Terminate all comments with a period, and wrap at 80 if possible or 100 max.

* User override instead of virtual + OVERRIDE

File src/kudu/cfile/binary_dict_block.cc:

Line 259:     // TODO: implement CopyNextAndEval for all blocks
Is this still relevant?  I think you have implemented CopyNextAndEval for plain 
binary blocks.

Line 264:   // None predicates or unsatisfied predicates should return no data
terminate all comments with a period.

File src/kudu/cfile/binary_dict_block.h:

Line 140:   virtual Status SeekForward(size_t* n) OVERRIDE {
drop 'virtual' and replace OVERRIDE with override, here and above.

File src/kudu/cfile/block_encodings.h:

PS2, Line 150: Fetch the next n values from the block
I think this should be up to n, right?  That's at least the contract of 

File src/kudu/cfile/cfile_reader.cc:

Line 728:     // No ctx_ defined implies there is no predicate evaluation at 
the decoder level, so CopyNextAndEval will not
wrap comments at 80 columns if possible (100 max).

File src/kudu/cfile/cfile_reader.h:

PS2, Line 385: prdicate

Line 387:   SelectionVector* GetPredicateSet() { return pred_set_.get();}
This seems like a strange place to put this method; why isn't it done by the 
BinaryDictBlockDecoder itself?

Line 480: 
extraneous line

File src/kudu/common/column_eval_context.h:

Going forward, we are using '#pragma once' instead of ifdefs for new files, see 

Line 17:                     bool &eval_complete) :
Style, see: 

PS2, Line 17: bool &
The style guide forbids non-const reference args and fields.  This is a little 
confusing, though.  Why is it taking a boolean by reference instead of by value?

PS2, Line 24:  &
ampersand should be with the type, same with the pointer * below.  I think 
technically the style guide waffles on this, but I think we more commonly do & 
and * with the type.  

File src/kudu/common/column_predicate.cc:

Line 270: bool ColumnPredicate::EvaluateCell(const void *cell) const {
move this above the anonymous namespace above, the function defined in it is 
tied to Evaluate.

Line 277:   else if (predicate_type() == PredicateType::IsNotNull) {
The IsNotNull and None branches shouldn't be necessary given the prior check.

File src/kudu/common/column_predicate.h:

Line 130:   bool EvaluateCell(const void *cell) const;
I think this is going to be really slow, because it forces a bunch of branches 
to happen per cell, e.g. figuring out what type the cell is and what type the 
predicate is.  I would have expected the performance on plain binary block 
decoding to get significantly slower because of this, most acutely with a 
low-selectivity predicate.

There is probably a better way to do this.  Perhaps we can figure out a way to 
have the column predicate return a fn-pointer that has already evaluated these 
branches?  Evaluate() internally does something like this, but it's obviously 
not exposed in the interface.

To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes

Reply via email to