Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/10983 )
Change subject: Kudu-1291 Efficiently support predicates on non-prefix key components ...................................................................... Patch Set 4: (41 comments) Thanks for the helpful comments Mike. Please review the changes. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc File src/kudu/cfile/cfile_reader.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/cfile/cfile_reader.cc@793 PS3, Line 793: > This only works if the cell is a pointer, like in the case of a string valu Yes, this works for string and binary values stored as uint8_t byte array. The ad-hoc index stores the keys that are either encoded as binary prefix blocks or binary dict blocks. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h File src/kudu/common/encoded_key.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.h@63 PS3, Line 63: um_columns = 0) > document this parameter in the comment Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.cc File src/kudu/common/encoded_key.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/encoded_key.cc@111 PS3, Line 111: num_columns, arena)) { > I thought we were going to generalize this to any column in the compound PK Right, I have updated this now. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h File src/kudu/common/partial_row.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h@494 PS3, Line 494: rivate: > This class is a part of the public C++ client API (see KUDU_EXPORT) so we w Thanks, it makes sense. I have now added the calling class (CfileSet) as a friend class of KuduPartialRow. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/partial_row.h@498 PS3, Line 498: friend class RowOpe > Why is this private field now public? Should not be necessary. This field is used in the function - CFileSet::Iterator::PrepareBatch , to set the minimum values for some columns. I am not quite sure if there is a better alternative to this. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/wire_protocol-test-util.h File src/kudu/common/wire_protocol-test-util.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/common/wire_protocol-test-util.h@38 PS3, Line 38: inline void AddTestRowWithNullableStringToPB(RowOperationsPB::Type op_type, > move this to the test where it's being used I have removed this now. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h File src/kudu/tablet/cfile_set.h: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@203 PS3, Line 203: : // This function is used to place the va > These methods need doc comments Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@230 PS3, Line 230: initted_(false), > add documentation for this method Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.h@264 PS3, Line 264: size_t cur_idx_; : size_t prepared_count_; > these variables should be documented with a comment Done http://gerrit.cloudera.org:8080/#/c/10983/1/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/1/src/kudu/tablet/cfile_set.cc@669 PS1, Line 669: > instead of this loop, I wonder if we can instead "peek" at the last value s Makes sense. But I am not quite sure if we can "peek" at the last seeked ordinal in the ad-hoc index column here. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc File src/kudu/tablet/cfile_set.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@395 PS3, Line 395: tate. > let's rename this CanUseSkipScan(), have it return a boolean, and set Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@397 PS3, Line 397: } > Add a flag DEFINE_bool(enable_skip_scan, true, "...") that allows us to tur Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@400 PS3, Line 400: int num_key_cols = base_data_->tablet_schema().num_key_columns(); > looks like a compile error? probably some unstaged changes Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@415 PS3, Line 415: for (auto it=predicates.begin(); it!=predicates.end(); ++it) { > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@418 PS3, Line 418: string col_name = (it->second).column().name(); > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@429 PS3, Line 429: nonfirst_key_pred_exists = true; > error: use of undeclared identifier 'suffix_key_id_'; did you mean 'suffix_ Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@511 PS3, Line 511: Arena arena2(1024); > The following huge block of code doesn't have any comments in it... mind ad Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@515 PS3, Line 515: Slice(key_iter_->GetCurrentValue()), &cur_enc_key); > error: use of undeclared identifier 'suffix_upper_bound_index_' [clang-diag Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@524 PS3, Line 524: r:: > use && not 'and' Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@544 PS3, Line 544: do { > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@571 PS3, Line 571: > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@572 PS3, Line 572: // Get the new encoded key > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@588 PS3, Line 588: > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@593 PS3, Line 593: if (suffix_key_match) { > error: use of undeclared identifier 'suffix_upper_bound_index_' [clang-diag Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@613 PS3, Line 613: } else { > error: use of undeclared identifier 'suffix_upper_bound_index_' [clang-diag Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@696 PS3, Line 696: // Convert current key slice to encoded key > error: use of undeclared identifier 'suffix_key_id_'; did you mean 'suffix_ Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@720 PS3, Line 720: int col_id = 0; > error: use of undeclared identifier 'suffix_key_id_' [clang-diagnostic-erro Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/cfile_set.cc@730 PS3, Line 730: break; > warning: do not use 'else' after 'break' [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc File src/kudu/tablet/index_skipscan-test-2.cc: PS3: > don't name this file -2 ... also this test is not included in tablet/CMakeL Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@57 PS3, Line 57: > warning: using decl 'InternalMiniCluster' is unused [misc-unused-using-decl Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@62 PS3, Line 62: > nit: constants such as these should be named kOnePK, kTwoPK, etc according Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@87 PS3, Line 87: > use kOnePK here instead of a magic number, also below Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@122 PS3, Line 122: : : : : : : > is this used? No. Thanks for pointing this out. Added its usage now. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@173 PS3, Line 173: > All of your test cases use sequential row values. We should add a test with Added 3 such use cases (that end with "random"). http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@184 PS3, Line 184: : : : > this seems very copy / paste, if the intention is to use the same values fo Done. Defined these values at the beginning. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@273 PS3, Line 273: > describe your test case here Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@276 PS3, Line 276: > nit: avoid using non-descriptive variable names like p1, I'm not sure what Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@284 PS3, Line 284: > describe this test Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tablet/index_skipscan-test-2.cc@294 PS3, Line 294: > please add a very short description of what we're testing here Done http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tserver/tablet_server-test-2.cc File src/kudu/tserver/tablet_server-test-2.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tserver/tablet_server-test-2.cc@158 PS3, Line 158: > Add a comment describing the purpose of this test and use a different name This file is removed now. This was an earlier version of test that I was using. http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tserver/tablet_server-test-base.cc File src/kudu/tserver/tablet_server-test-base.cc: http://gerrit.cloudera.org:8080/#/c/10983/3/src/kudu/tserver/tablet_server-test-base.cc@85 PS3, Line 85: : schema_(GetSimpleTestSchema()), : ts_test_metric_entity_(METR > do this in your test child classes, not in the base class Done -- To view, visit http://gerrit.cloudera.org:8080/10983 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f Gerrit-Change-Number: 10983 Gerrit-PatchSet: 4 Gerrit-Owner: Anupama Gupta <anupama.gu...@cloudera.com> Gerrit-Reviewer: Anupama Gupta <anupama.gu...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Mon, 30 Jul 2018 18:56:26 +0000 Gerrit-HasComments: Yes