Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291: efficiently support predicates on non-prefix key 
components
......................................................................


Patch Set 19:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10983/19//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10983/19//COMMIT_MSG@19
PS19, Line 19: untill
nit: only one 'l'


http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/cfile/cfile_reader.cc@794
PS19, Line 794: 8192
nit: should probably be using FLAGS_max_encoded_key_size_bytes here


http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/cfile_set.h@303
PS19, Line 303:  &cur_enc_key,
              :                                   KuduPartialRow *p_row, 
gscoped_ptr<EncodedKey> *
nit: reverse sigil-space order


http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/cfile_set.h@354
PS19, Line 354:  *
nit: here too


http://gerrit.cloudera.org:8080/#/c/10983/16/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

http://gerrit.cloudera.org:8080/#/c/10983/16/src/kudu/tablet/index_skipscan-test.cc@104
PS16, Line 104:
              :   IndexSkipScanTest()
              :       : KuduTabletTest(CreateSchema(std::get<0>(GetParam()))) {
              :     const auto& param = GetParam();
              :     schema_type = std::get<0>(param);
              :     FLAGS_enable_skip_scan = std::get<1>(param);
              :   }
              :
              :   void SetUp() override {
              :     KuduTabletTest::SetUp();
              :     
ASSERT_OK(tablet()->metadata()->CreateRowSet(&rowset_meta_));
              :     FillTestTablet();
              :   }
              :
              :   // Generates and inserts given number of rows using the given 
PRNG,
              :   // returns # rows generated that match the predicate_val on 
predicate_col.
              :   int GenerateData(Random random, int num_rows, int 
predicate_col_id, int64_t predicate_value) {
              :
              :     LocalTabletWriter writer(tablet().get(), &client_schema_);
              :     KuduPartialRow row(&client_schema_);
              :
              :     size_t num_key_cols = client_schema_.num_key_columns();
              :     int num_matching = 0;
              :
              :     while (num_rows > 0) {
              :       bool predicate_value_matched = false;
              :       for (int col_idx = 0; col_idx < num_key_cols; col_idx++) {
              :         int64_t value = random.Uniform(1000);
              :         CHECK_OK(row.SetInt64(col_idx, value));
              :         if (col_idx == predicate_col_id && value == 
predicate_value) {
              :           predicate_value_matched = true;
              :         }
              :       }
              :
This is no longer used


http://gerrit.cloudera.org:8080/#/c/10983/17/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

http://gerrit.cloudera.org:8080/#/c/10983/17/src/kudu/tablet/index_skipscan-test.cc@589
PS17, Line 589: int num_matching = GenerateData(random, num_rows, 
predicate_col_id, predicate_val);
              :
> Right, the original idea was to generate almost the same order of matching
Seems like GenerateDataEx() accomplishes the "every Nth row" case with N fixed 
at 3, but not the case where all or almost all rows match. Is that still a goal?


http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/index_skipscan-test.cc@153
PS19, Line 153: int GenerateDataEx
nit: rename this to GenerateData and add docs


http://gerrit.cloudera.org:8080/#/c/10983/19/src/kudu/tablet/index_skipscan-test.cc@168
PS19, Line 168: case 1:
              :               CHECK_OK(row.SetInt64(col_idx, predicate_value / 
2 + 1));
              :               match = false;
              :               break;
              :             case 2:
              :               CHECK_OK(row.SetInt64(col_idx, predicate_value / 
2 - 1));
              :               match = false;
              :               break;
Does it matter what these values are? It seems like the predicate column will 
only have three values in it. Maybe instead we can instead inject random data 
(and check them against the predicate) instead?



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 19
Gerrit-Owner: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sun, 19 Aug 2018 18:36:42 +0000
Gerrit-HasComments: Yes

Reply via email to