Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
......................................................................


Patch Set 10:

(14 comments)

I've only been through a few files, but here are some nits so far. Seems there 
are a few open points from Mike too.

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h
File src/kudu/cfile/cfile_reader.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@59
PS10, Line 59: static const int kPrimaryKeyMaxSize = 16*1024;
nit: is this a hard limit or just a target? How is this enforced?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@339
PS10, Line 339: to store the current value
nit: note that the "current value" refers to the value after seeking.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/cfile/cfile_reader.h@470
PS10, Line 470: Status SetCurrentValue();
              :
super-nit: in reading the name SetX(), based on other usages in the codebase 
(grep -r "Set\w" in src/kudu), I kind of expect the contract of some SetX(x) to 
set some internal variable to x. Mind changing this to something like 
StoreCurrentValue or CacheCurrentValue?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h
File src/kudu/common/encoded_key.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/encoded_key.h@63
PS10, Line 63:  int num_columns = 0
nit: what does this do?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h
File src/kudu/common/partial_row.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/common/partial_row.h@52
PS10, Line 52: template<typename KeyTypeWrapper> struct SliceTypeRowOps; // 
IWYU pragma: keep
             :   template<typename KeyTypeWrapper> struct NumTypeRowOps;   // 
IWYU pragma: keep
nit: not your fault, but mind removing the spaces here?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@207
PS10, Line 207: Arena* arena
nit: note that (I think) arena stores the buffer containing with the encoded key


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@207
PS10, Line 207: gscoped_ptr
nit: if the change would be contained within cfile_set.cc, mind updating the 
newly added occurrences of gscoped_ptr to unique_ptr?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@209
PS10, Line 209: prefix_key
nit: Since this isn't a variable name, maybe take out the underscore.


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210
PS10, Line 210: "num_cols"
nit: 'num_prefix_cols'


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.h@210
PS10, Line 210:   // "prefix_key" refers to the first "num_cols" columns of the 
current key
              :   // (current key is the key currently pointed to by 
validx_iter_).
              :   // 'cache_seeked_value' controls whether we will remember the 
key seeked to
              :   // in the "current value" of the iterator.
nit: can you reword this to clarify whether "current" refers to before or after 
the seek?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.cc@413
PS10, Line 413: !(lower_bound_idx_ == 0 && upper_bound_idx_ == row_count_)
nit: IMO slightly easier to read as (lower_bound_idx_ != 0 || upper_bound_idx 
!= row_count_)


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.cc@423
PS10, Line 423: const std::unordered_map<std::string, ColumnPredicate> 
&predicates = spec.predicates();
              :   for (auto it=predicates.begin(); it!=predicates.end(); ++it) {
Merge these into something like:

 for (const auto& col_and_pred: predicates) {
   const string& col_name = col_and_pred.first;
   const ColumnPredicate& pred = col_and_pred.second;
   const Schema& schema = base_data_->tablet_schema();
   // ... use `col_name`, `pred`, and `schema`
 }


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.cc@441
PS10, Line 441: skip_scan_predicate_column_id_ = col_id;
              :
              :         // Store the predicate value.
              :         skip_scan_predicate_value_ = (it->second).raw_lower();
I think it's generally good hygiene that these don't get modified until after 
we know we're enable skip-scanning, like we do for `use_skip_scan_`. Though 
maybe it doesn't matter if these aren't going to be used by the iterator anyway?


http://gerrit.cloudera.org:8080/#/c/10983/10/src/kudu/tablet/cfile_set.cc@509
PS10, Line 509:   TryEnableSkipScan(*spec);
nit: I know it doesn't make a difference execution-wise, but I think this makes 
more sense up by Init() at L394



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 07 Aug 2018 05:21:18 +0000
Gerrit-HasComments: Yes

Reply via email to