Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 )
Change subject: Blogpost describing index skip scan optimization.
......................................................................

Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73: exceeds .
: Therefore, in order to use skip scan performance benefits when possible and maintain a consistent performance with
> I think it's the number of rows in the CFileSet, which I think is also the
You are right! How about rewording this to "rows in tablet"?

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS5, Line 9: index skip scan (a.k.
> It's great that you found another reference to the same idea in the google'
Sounds good. Done.

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13
PS5, Line 13: Let's b
> nit: probably don't need this
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS5, Line 40: an option
> nit: probably don't need this
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38
PS5, Line 38:
: Instead, a full table scan is done by default. Other databases may optimize such scans by building secondary indexes
: (though it might be redundant to build one on one of the
> Let's stick with a single concrete example, say `tstamp`. Then we can point
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41
PS5, Line 41: given its lack of secondary index support.
:
: The question is, can Kudu do better than a full table scan here?
:
> nit: I think this would read better after L45. E.g.
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS5, Line 47: in the in
> nit: since this is a concrete example, we know there is only one column bef
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50
PS5, Line 50:
: {% highlight SQL %}
> nit: reword as "to **skip** to the rows that have distinct prefix keys, and
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: ce, this metho
> nit: "Kudu tablet" or "tablet server" or "Kudu"
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: as **skip scan optimization**[2-3].
:
: Performance
> Maybe reverse the order of **skip** and **scan**, since the name is "skip s
Done. You are correct, I have rephrased the sentence to better clarify this point.

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70
PS5, Line 70:
> nit: add "the" in front of "Lower" and "better"
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71
PS5, Line 71: nts, on up to 10 million rows per t
> I seem to recall a plot that showed the performance without the dynamic dis
That's a good point. Unfortunately, I do not have the backup of that slide.

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77:
> nit: skips? for consistency with the "skip" and "scan" terminology
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77: It will be an in
> nit: I think it's clear enough that this may refer to multiple, so maybe ju
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89
PS5, Line 89:
> nit: one (`host`)
Done

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104
PS5, Line 104: [[1
> Do you feel good about adding one more reference? I think https://www.sqli
Thanks so much for this suggestion. I have added this reference too.

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105
PS5, Line 105: Geo-
> nit: usually in the reference section they use '[x]' where it's possible to
Thank you for pointing this out. Done.

--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Anupama Gupta <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Comment-Date: Sun, 09 Sep 2018 19:07:51 +0000
Gerrit-HasComments: Yes
