Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
......................................................................


Patch Set 9:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@260
PS9, Line 260: cur_idx_ >= skip_scan_upper_bound_idx_.
this is no longer true (checked inside the method)


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@264
PS9, Line 264:   // 1. key_iter_->cur_val_ stores the first entry containing 
both the next unique
             :   //    prefix key and the predicate value, remaining to be 
scanned in the current batch.
Is this an implementation detail or part of the interface of this method? If 
it's the former then we should remove it from the header file.


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@266
PS9, Line 266:   // 2. skip_scan_lower_bound_idx will store the row id of the 
entry found in 1.
             :   // 3. skip_scan_upper_bound_idx is the row id which is an 
exclusive upper bound
             :   //    on our current scan range.
These variables are local to this method, so I don't see why a caller would 
care about this. If we want to document this, we should document it in the 
implementation file.


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@269
PS9, Line 269:   // 4. cur_idx_ is either same as skip_scan_lower_bound_idx or 
remains unchanged (this occurs
             :   //    when cur_idx_ lies somewhere in between the current scan 
range).
This seems to be one of the key postconditions from my perspective, since it 
affects class-level state that is accessed by other methods in this class. Can 
this be phrased in a way that doesn't reference local implementation variables 
but rather provides an explanation about what this does?


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.h@271
PS9, Line 271: remaining
nit: 'remaining'


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@639
PS9, Line 639:   // 1. Place the validx_iter_ at the entry containing the next
             :   //    unique "prefix_key" value.
             :   // 2. Place the iterator at or after the entry formed by the 
"prefix_key"
             :   //    found in 1. and the predicate value.
             :   // 3. If the seek in 2. results in exact match with the 
predicate value,
             :   //    store the row id that represents the last relevant entry 
(upper bound) wrt the
             :   //    current "prefix_key"(found in 1.)
How about using something a bit higher level to describe this algorithm:

1. Search within the ad-hoc index for the next unique prefix (set of keys prior 
to the first predicate column). Searching is done using validx_iter_.
2. Read that unique prefix from the ad-hoc index, append the lower bound of the 
first predicate column, and seek to that. If this matches, this is the lower 
bound of our desired scan.
3. If we found our desired lower bound, find an upper bound for the scan by 
searching for the next row key matching one value higher than the highest value 
that will match our first predicate.


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@671
PS9, Line 671: skip_scan_searched_cur_prefix_
The values need to be inverted due to the variable name change; it doesn't make 
sense to seek to the next prefix key if we didn't search the current prefix.


http://gerrit.cloudera.org:8080/#/c/10983/9/src/kudu/tablet/cfile_set.cc@820
PS9, Line 820:
nit: unexpected indentation



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta <anupama.gu...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Anupama Gupta <anupama.gu...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 06 Aug 2018 21:04:26 +0000
Gerrit-HasComments: Yes

Reply via email to