Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: Kudu-1291 Efficiently support predicates on non-prefix key 
components
......................................................................


Patch Set 5:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.h@217
PS5, Line 217:   // This is a three seek approach for index skip scan 
implementation:
             :   // 1. Place the validx_iter_ at the entry containing the next
             :   //    unique "prefix_key" value.
             :   // 2. Place the iterator at or after the entry formed by the 
"prefix_key"
             :   //    found in 1. and the predicate value.
             :   // 3. If the seek in 2. results in exact match with the 
predicate value,
             :   //    store the row id that represents the last relevant entry 
(upper bound) wrt the
             :   //    current "prefix_key"(found in 1.)
I think this comment should be moved to the .cc file.

Instead of this comment, how about something describing the API we are 
providing here, perhaps something along these lines:

  // This method implements a "skip-scan" optimization, allowing a scan to use
  // the primary key index to efficiently seek to matching rows where there are
  // predicates on compound key columns that do not necessarily include the
  // leading column of the primary key. At the time of writing, only a single
  // equality predicate is supported, although the algorithm can support ranges
  // of values.
  //
  // This method should be invoked during the PrepareBatch() phase of the row
  // iterator lifecycle.
  //
  // The in-out parameter 'remaining' refers to the number of rows remaining to
  // scan. When this method is invoked, 'remaining' should contain the maximum
  // number of remaining rows available to scan. Once this method returns,
  // 'remaining' will contain the number of rows to scan to consume the
  // available matching rows according to the equality predicate. Note:
  // 'remaining' will always be at least 1, although it is a TODO to allow it
  // to be 0 (0 violates CHECK conditions elsewhere in the scan code).

  // Preconditions upon entering this method:
  // - ...
  //
  // Postconditions upon entering this method:
  // - The iterator will be reset... (add details about what member variables 
might be changed)
  //
  // See the .cc file for details on the approach and the implementation.


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@565
PS5, Line 565: Get
s/Get/Return/


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@568
PS5, Line 568:   LOG(INFO) << " encoded key in build new encoded key = " << 
new_enc_key->Stringify(base_data_->tablet_schema());
remove or VLOG(2)


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@581
PS5, Line 581: column_value_1
nit: use more descriptive variable names here


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@588
PS5, Line 588: UseSkipScan
how about rename this to: SkipToNextScan()


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@602
PS5, Line 602:     // TODO: If we loop > 100 times, break out of the loop.
I'm not comfortable merging this patch without this feature


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@662
PS5, Line 662:
exclusive upper bound


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@668
PS5, Line 668:       // We hit the end of the file, so we will simply scan to 
the end of the
             :       // file.
what guarantees the skip_scan_upper_bound_idx_ has been set to 
upper_bound_idx_? Should we explicitly set that?


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@675
PS5, Line 675: We
wording nit: Check to see whether we have seeked backwards. If so, we need to 
keep looking...


http://gerrit.cloudera.org:8080/#/c/10983/5/src/kudu/tablet/cfile_set.cc@684
PS5, Line 684:   } while (!predicate_match);
can we convert this to a while-loop instead of a do-while-loop, now that the 
loop condition is so simple? We can initialize the loop with bool 
predicate_match = false



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 5
Gerrit-Owner: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 02 Aug 2018 18:30:00 +0000
Gerrit-HasComments: Yes

Reply via email to