Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 )
Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components ...................................................................... Patch Set 4: (14 comments) http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc@793 PS3, Line 793: > Yes, this works for string and binary values stored as uint8_t byte array. What happens if the cfile is storing something other than the ad-hoc index? This is not in general safe, we should put in additional safeguards if this is how we want to do this, but I am a little skeptical about the approach, need to think about it a bit more. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/cfile/cfile_reader.cc@785 PS4, Line 785: Slice ret; add: DCHECK_EQ(nullptr, validx_iter_); Also, are we sure that the value index can only ever be a string? http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h File src/kudu/common/encoded_key.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h@63 PS3, Line 63: um_columns = 0) > Done I don't see num_columns documented in the header file comment in rev 4 http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@204 PS4, Line 204: // This function is used to place the validx_iter_ at the next greater "prefix_key". nit: you should try to maintain the same order in the header file as the .cc file per the style guide http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@283 PS4, Line 283: skip_scan_enabled_ rename to use_skip_scan_ because this is confusingly similar to the gflag name FLAGS_enable_skip_scan http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@286 PS4, Line 286: suffix_key_pred_value_ rename to skip_scan_predicate_value_ http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@289 PS4, Line 289: suffix_key_column_id_ rename to skip_scan_predicate_column_id_ http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@291 PS4, Line 291: // Row id of the last entry of "suffix_key" predicate value wrt to the : // "prefix_key" currently pointed to by validx_iter_ use something like this as the description: // Row id of the next row that does not match the predicate. // This is an exclusive upper bound on our scan range. // A value of -1 indicates that the upper bound is not known. http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@293 PS4, Line 293: suffix_key_upper_bound_idx_ rename to skip_scan_upper_bound_idx_ http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.h@293 PS4, Line 293: int make this an int64_t http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@538 PS4, Line 538: if (skip_scan_enabled_ && (static_cast<int>(cur_idx_) > suffix_key_upper_bound_idx_)) { Can you break this out into its own function? http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@540 PS4, Line 540: bool suffix_key_match, continue_seeking_next_prefix = false; nit: declare these variables separately http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@542 PS4, Line 542: Status next_entry_match, suffix_key_upper_bound_match; Can we avoid maintaining so much state outside the loop? http://gerrit.cloudera.org:8080/#/c/10983/4/src/kudu/tablet/cfile_set.cc@602 PS4, Line 602: continue_seeking_next_prefix = true; not indented -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta <[email protected]> Gerrit-Reviewer: Anupama Gupta <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Tue, 31 Jul 2018 23:17:11 +0000 Gerrit-HasComments: Yes
