Todd Lipcon has posted comments on this change.

Change subject: Predicate evaluation pushdown
......................................................................


Patch Set 7:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/cfile/binary_plain_block.cc
File src/kudu/cfile/binary_plain_block.cc:

Line 306: Status BinaryPlainBlockDecoder::CopyNextValues(size_t *n, 
ColumnDataView *dst) {
I wonder whether this function could share code with the new one using some 
templating (to force that they're separately instantiated)


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/cfile/binary_prefix_block.h
File src/kudu/cfile/binary_prefix_block.h:

Line 99:     cur_idx_ += *n;
surprised this doesn't need to update cur_val_ and next_ptr_


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

Line 488:   ctx->SetDecoderEvalNotSupported();
shouldn't this be done in PrepareEvalContext?


PS7, Line 489: ctx->block()
given the number of times you use ctx->block() in this function I think it's 
worth just doing: ColumnBlock* dst = ctx->block()


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

Line 355:   // Sets context if needed. This must happen before a call to
no need for this line (since it's already described in the interface above)


PS7, Line 390: ;
nit: space after ;


PS7, Line 463: or>cod
nit: space after >


Line 480:   // Predicate and selection vector associated with the iterator
would this ever be null? worth a comment explaining when it might be


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

Line 190:   void SeekForward(size_t *n) override {
hrm, I'm skeptical of this function, I'd think it would need to update the 
rle_decoder_ as well.

Maybe update encoding-test so that this is covered?


Line 413:   void SeekForward(size_t *n) override {
same


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/column_eval_context.h
File src/kudu/common/column_eval_context.h:

Line 26: // get passed down to the decoders during predicate push-down.
this is now more generally the set of objects passed down to the column 
iterator and decoder during column materialization, right?


Line 31: public:
nit: indent one


Line 43:   // Column index in the parent CFileSet.
I think this is incorrect and it's actually the index within the projection 
schema, rather than within the CFileSet. Perhaps we should call it 
projection_idx or something to clarify?


Line 51:   // I.e. BinaryDictBlockDecoder.CopyNextAndEval(), 
CFileIterator.Scan()
not quite following this comment. when is this used outside of scans?


Line 54:   // Selection vector reflecting the result of the predicate 
evaluation.
should clarify whether this is an "out" or an "in-out" type parameter.

In other words, if the scanner sees that a column is already non-selected in 
here, is it free to skip reading that row? I'm guessing so, but good to clarify.


Line 57:   // Must be greater than size 0 and must be large enough to cover the 
number
isn't "greater than size 0" redundant with "large enough to cover the number of 
rows"?


Line 63:   // materializing_iterator_decoder_eval to false, forcing evaluation 
to fall
I think this comment needs an update


Line 74:     if (decoder_eval_status_ == kNotSet) {
it seems weird that this function is a silent no-op in the case that it's 
already been set. If it's set in a "conflicting" manner, what happens? seems 
like it should either be a DCHECK, or by a 'TrySetDecoderEvalSupported' type 
function that returns a bool about whether it successfully changed modes or not.


Line 79:   void SetDecoderEvalNotSupported() {
particularly seems like here it should be a CHECK that it's kNotSet. Otherwise 
you might try to switch from supported to non-supported, and only partially 
have filled in the selection vector


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

Line 26: #include "kudu/common/column_eval_context.h"
nit: sort the includes


PS7, Line 522:  = ColumnEvalContext
can just be:

ColumnEvalContext ctx(...)

instead of the extra temporary construction and assignment


Also, given that we now use ColumnEvalContext for all materialization, and not 
just the one with predicate evaluation, we should rename to 
ColumnMaterializationContext


Line 526:     if (ctx.pred()->predicate_type() == PredicateType::None) {
does a None predicate ever make it this far into the code? I would think that 
the None would be handled way up above this layer (and this could be a DCHECK)


Line 549:     ColumnEvalContext ctx = ColumnEvalContext(col_idx,
same


Line 553:     ctx.SetDecoderEvalNotSupported();
why's this necessary? shouldn't the nullptr predicate be sufficient for the 
code to know it doesn't need to evaluate any predicate?


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/generic_iterators.h
File src/kudu/common/generic_iterators.h:

Line 28: #include "kudu/common/column_predicate.h"
this new include isn't used in this header file (include it in the .cc if it's 
only used in the .cc)


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/iterator.h
File src/kudu/common/iterator.h:

Line 97:   // col_idx is within the projection schema, not the underlying 
schema.
this comment probably needs to move to the context's constructor (since 
'col_idx' is no longer a parameter here)


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

PS7, Line 121:  The 
nit: "A SelectionVectorView" (rather than 'the', because there are probably 
many such instances)


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/common/types.h
File src/kudu/common/types.h:

Line 399
nit: removed a blank line here


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/tablet/cfile_set-test.cc
File src/kudu/tablet/cfile_set-test.cc:

Line 112:     ctx.SetDecoderEvalNotSupported();
same comment here as elsewhere -- if there's no predicate we shouldn't have to 
do this, right?


Line 115:  private:
nit: add blank line before 'private:'


http://gerrit.cloudera.org:8080/#/c/3990/7/src/kudu/tablet/tablet-test-util.h
File src/kudu/tablet/tablet-test-util.h:

Line 146:                                                int limit = INT_MAX) {
'limit' is never used, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to