Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 )
Change subject: Blogpost describing index skip scan optimization.
......................................................................

Patch Set 6:

(14 comments)

Almost all nits pretty much. Looking good!

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG@9
PS6, Line 9: Link to the version with images:
Very nice :)

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS6, Line 39: table
nit: tablet, here and elsewhere

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS6, Line 45: (we will refer to it as
            : `prefix column` and it's specific value as `prefix key`)
nit: since we're not using these as variable names, but rather as definitions, we should use quotation marks. Also drop the apostrophe in "its". I.e.:

we will refer to it as the "prefix column" and its specific value as the "prefix key"

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48
PS6, Line 48: Therefore, we can use the index to **skip** to the rows that have distinct prefix keys,
            : and also satisfy the predicate on the `tstamp` column.
nit: maybe drop the ** around "skip" here, since you do it down below anyway.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58
PS6, Line 58: `host` = helium
nit: would be nice if the entire thing were in backticks, since it's a condition? Seems a little awkward, this mix of backticks and no backticks. WDYT?

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60
PS6, Line 60: such as `ubuntu`, `westeros`
nit: probably not needed

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59
PS6, Line 59: with
            : this prefix key
nit: reword "until the predicate no longer matches. At that point we would know that no more rows with `host = helium` will satisfy the predicate, and we can skip to the next prefix key."

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67
PS6, Line 67: (s)
nit: this is a little distracting, below too. Let's just keep it singular, since you call out at the end that it can be any number of prefix columns anyway.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS6, Line 73: ![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
This and below are being rendered weirdly by GitHub. Would like to confirm this doesn't happen with Jekyll.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS6, Line 74: consistent performance with
            : respect to the prefix columns cardinality
"consistent performance in cases of large prefix column cardinality"

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93
PS6, Line 93: on the non-first key columns(s)
nit: probably not needed, below too.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94
PS6, Line 94: IN list
nit: In-list

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS6, Line 98: Team
nit: team

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101
PS6, Line 101: References
             : ==========
             :
             : [[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf): Gupta, Ashish, et al. "Mesa:
             : Geo-replicated, near real-time, scalable data warehousing." Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.
             :
             : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database
             :
             : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite
It's really up to you, but WDYT about just linking these in-line? This is a webpage after all :)

--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Anupama Gupta <ag3...@columbia.edu>
Gerrit-Reviewer: Attila Bukor <abu...@apache.org>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Comment-Date: Wed, 12 Sep 2018 00:08:28 +0000
Gerrit-HasComments: Yes