Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: Kudu-1291 Efficiently support predicates on non-prefix key 
components
......................................................................


Patch Set 4:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc@793
PS3, Line 793:
> Yes, this works for string and binary values stored as uint8_t byte array.
What happens if the cfile is storing something other than the ad-hoc index? 
This is not in general safe, we should put in additional safeguards if this is 
how we want to do this, but I am a little skeptical about the approach, need to 
think about it a bit more.


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc@785
PS4, Line 785:   Slice ret;
add: DCHECK_EQ(nullptr, validx_iter_);

Also, are we sure that the value index can only ever be a string?


http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h
File src/kudu/common/encoded_key.h:

http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h@63
PS3, Line 63: um_columns = 0)
> Done
I don't see num_columns documented in the header file comment in rev 4


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@204
PS4, Line 204:   // This function is used to place the validx_iter_ at the next 
greater "prefix_key".
nit: you should try to maintain the same order in the header file as the .cc 
file per the style guide


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@283
PS4, Line 283: skip_scan_enabled_
rename to use_skip_scan_ because this is confusingly similar to the gflag name 
FLAGS_enable_skip_scan


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@286
PS4, Line 286: suffix_key_pred_value_
rename to skip_scan_predicate_value_


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@289
PS4, Line 289: suffix_key_column_id_
rename to skip_scan_predicate_column_id_


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@291
PS4, Line 291:   // Row id of the last entry of "suffix_key" predicate value 
wrt to the
             :   // "prefix_key" currently pointed to by validx_iter_
use something like this as the description:

  // Row id of the next row that does not match the predicate.
  // This is an exclusive upper bound on our scan range.
  // A value of -1 indicates that the upper bound is not known.


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@293
PS4, Line 293: suffix_key_upper_bound_idx_
rename to skip_scan_upper_bound_idx_


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@293
PS4, Line 293: int
make this an int64_t


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@538
PS4, Line 538:   if (skip_scan_enabled_ && (static_cast<int>(cur_idx_) > 
suffix_key_upper_bound_idx_)) {
Can you break this out into its own function?


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@540
PS4, Line 540: bool suffix_key_match, continue_seeking_next_prefix = false;
nit: declare these variables separately


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@542
PS4, Line 542:     Status next_entry_match, suffix_key_upper_bound_match;
Can we avoid maintaining so much state outside the loop?


http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@602
PS4, Line 602: continue_seeking_next_prefix  = true;
not indented



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 4
Gerrit-Owner: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 31 Jul 2018 23:17:11 +0000
Gerrit-HasComments: Yes

Reply via email to