Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
......................................................................


Patch Set 6:

(14 comments)

Almost all nits pretty much. Looking good!

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG@9
PS6, Line 9: Link to the version with images:
Very nice :)


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS6, Line 39: table
nit: tablet, here and elsewhere


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS6, Line 45: (we will refer to it as
            : `prefix column` and it's specific value as `prefix key`)
nit: since we're not using these as variable names, but rather as 
definitions, we should use quotation marks. Also drop the apostrophe in "it's". I.e.:

we will refer to it as the "prefix column" and its specific value as the 
"prefix key"


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48
PS6, Line 48: Therefore, we can use the index to **skip** to the rows that have distinct prefix keys,
            : and also satisfy the predicate on the `tstamp` column.
nit: maybe drop the ** around "skip" here, since you bold it down below anyway.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58
PS6, Line 58: `host` = helium
nit: would be nice if the entire thing were in backticks, since it's a 
condition? Seems a little awkward, this mix of backticks and no backticks. WDYT?


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60
PS6, Line 60:  such as `ubuntu`, `westeros`
nit: probably not needed


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59
PS6, Line 59:  with
            : this prefix key
nit: reword "until the predicate no longer matches. At that point we would know 
that no more rows with `host = helium` will satisfy the predicate, and we can 
skip to the next prefix key."
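
For reference, the scan-then-skip behavior this wording describes could be sketched roughly as follows. This is only an illustration, not Kudu's implementation: the function name `skip_scan`, the in-memory sorted list standing in for a tablet's primary key index, and the equality predicate on `tstamp` are all assumptions made for the example.

```python
import math
from bisect import bisect_left, bisect_right


def skip_scan(rows, tstamp):
    """Hypothetical sketch: rows is a list of (host, tstamp) tuples sorted by
    the composite key; returns rows whose tstamp equals the given value."""
    matches = []
    pos = 0
    while pos < len(rows):
        prefix = rows[pos][0]  # current prefix key, e.g. "helium"
        # Seek within this prefix key to the first row with tstamp >= target.
        pos = bisect_left(rows, (prefix, tstamp), lo=pos)
        # Scan until the predicate no longer matches.
        while pos < len(rows) and rows[pos] == (prefix, tstamp):
            matches.append(rows[pos])
            pos += 1
        # No more rows with this prefix key can match: skip to the next one.
        pos = bisect_right(rows, (prefix, math.inf), lo=pos)
    return matches


rows = sorted([
    ("helium", 100), ("helium", 200), ("helium", 300),
    ("ubuntu", 100), ("ubuntu", 200),
    ("westeros", 200), ("westeros", 400),
])
print(skip_scan(rows, 200))
```

Each prefix key costs two seeks plus the matching rows, which is why the approach only pays off when the number of distinct prefix keys is small relative to the number of rows.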


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67
PS6, Line 67: (s)
nit: this is a little distracting, below too. Let's just keep it singular since 
you call out at the end that it can be any number of prefix columns anyway.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS6, Line 73: 
![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
This and below are being rendered weirdly by GitHub. Would like to confirm this 
doesn't happen with Jekyll.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS6, Line 74: consistent performance with
            : respect to the prefix columns cardinality
"consistent performance in cases of large prefix column cardinality"


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93
PS6, Line 93:  on the non-first key columns(s)
nit: probably not needed, below too.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94
PS6, Line 94: IN list
nit: In-list


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS6, Line 98: Team
nit: team


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101
PS6, Line 101: References
             : ==========
             :
             : [[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf): Gupta, Ashish, et al. "Mesa:
             : Geo-replicated, near real-time, scalable data warehousing." Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.
             :
             : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database
             :
             : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite
It's really up to you, but WDYT about just linking these in-line? This is a 
webpage after all :)



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Comment-Date: Wed, 12 Sep 2018 00:08:28 +0000
Gerrit-HasComments: Yes
