[kudu-CR] KUDU-2566: Enhance rowset tree pruning and stop coping string while querying
Hello Will Berkeley, Tidy Bot, Kudu Jenkins, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11381 to look at the new patch set (#6). Change subject: KUDU-2566: Enhance rowset tree pruning and stop coping string while querying .. KUDU-2566: Enhance rowset tree pruning and stop coping string while querying Three improvements: 1) Support open-ended intervals: Previously, the rowset tree could only compute overlap with a closed and bounded interval. This changeset adds the ability to find overlap for unbounded intervals as well. As a result, tablet iterators with a single primary key bound are more efficient because they do not iterate over rowsets that don't overlap with the key bound. 2) End up fetching one more rowset for the upper bound which is exclusive: This changeset also adds the ability to query the rowset tree using an exclusive upper bound, whereas previously only inclusive intervals were supported. This makes scans more efficient since primary key upper bounds are passed as exclusive bounds, so now rowsets that overlap only in the upper bound are excluded. 3) Perf improvement: using raw slices instead of copying to strings while querying. The copying from slices to string is discarded, which could help to increase the query performance. Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79 --- M src/kudu/tablet/rowset_tree-test.cc M src/kudu/tablet/rowset_tree.cc M src/kudu/tablet/rowset_tree.h M src/kudu/tablet/tablet.cc M src/kudu/util/interval_tree-inl.h M src/kudu/util/interval_tree-test.cc M src/kudu/util/interval_tree.h 7 files changed, 369 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/11381/6 -- To view, visit http://gerrit.cloudera.org:8080/11381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79 Gerrit-Change-Number: 11381 Gerrit-PatchSet: 6 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Will Berkeley
[kudu-CR] [gutil] add LookupOrEmplace()
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11401 ) Change subject: [gutil] add LookupOrEmplace() .. Patch Set 1: (3 comments) The change looks fine with me, though I think Adar may the same change lying around somewhere. See https://gerrit.cloudera.org/c/11303/4/src/kudu/gutil/map-util.h#464 for a discussion we had about it and emplace semantics. http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/gutil/map-util.h File src/kudu/gutil/map-util.h: http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/gutil/map-util.h@498 PS1, Line 498: // It's similar to LookupOrInsert() but uses the emplace and r-value mechanics : // to achive the desired results. Worth noting that if there is an element with the key already exists in the container, the arguments will still be "used up". "The element may be constructed even if there already is an element with the key in the container, in which case the newly constructed element will be destroyed immediately." >From https://en.cppreference.com/w/cpp/container/map/emplace http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc File src/kudu/util/map-util-test.cc: http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc@133 PS1, Line 133: auto* val_ptr = FindOrNull(int_map, key); : ASSERT_NE(nullptr, val_ptr); Is there a particular reason to use `FindOrNull()` + `ASSERT_NE(nullptr)` instead of `FindOrDie()`? To introduce test coverage for `FindOrNull()` perhaps? http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc@171 PS1, Line 171: //ASSERT_NE(nullptr, val.get()); Uncomment? Below too. Also see my comment in the header about documenting these semantics. -- To view, visit http://gerrit.cloudera.org:8080/11401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a177f410c1d4ed33e6f44cb7d6c33d6141f84fc Gerrit-Change-Number: 11401 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 10 Sep 2018 03:16:37 + Gerrit-HasComments: Yes
[kudu-CR] [jepsen] set default admin operation timeout to 5 minutes
Hello Kudu Jenkins, Hao Hao, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11405 to look at the new patch set (#2). Change subject: [jepsen] set default admin operation timeout to 5 minutes .. [jepsen] set default admin operation timeout to 5 minutes Increased the default admin operation timeout for the Kudu client used in Jepsen tests up to 5 minutes. That's to work around slow DNS resolutions (sometimes up to 30 seconds) that happen intermittently while running the tests at VMs provisioned in GCE. Change-Id: Iaa9ade5cf38b4f3247749fa33bd40ada9616f21f --- M java/kudu-jepsen/src/main/clojure/jepsen/kudu/client.clj 1 file changed, 3 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/11405/2 -- To view, visit http://gerrit.cloudera.org:8080/11405 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iaa9ade5cf38b4f3247749fa33bd40ada9616f21f Gerrit-Change-Number: 11405 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6 PS4, Line 6: : Blogpost describing index skip scan optimization. : > Thanks for this Andrew. I am still not sure why images are not getting rend Resolved this issue now. Added a link to the rendered version. -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: comment Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Mike Percy Gerrit-Comment-Date: Sun, 09 Sep 2018 20:13:26 + Gerrit-HasComments: Yes
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Anupama Gupta has posted comments on this change. ( http://gerrit.cloudera.org:8080/11263 ) Change subject: Blogpost describing index skip scan optimization. .. Patch Set 6: (17 comments) http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73 PS4, Line 73: exceeds ![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D). : Therefore, in order to use skip scan performance benefits when possible and maintain a consistent performance with > I think it's the number of rows in the CFileSet, which I think is also the You are right ! How about rewording this to "rows in tablet" ? http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md: http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9 PS5, Line 9: index skip scan (a.k. > It's great that you found another reference to the same idea in the google' Sounds good. Done. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13 PS5, Line 13: Let's b > nit: probably don't need this Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40 PS5, Line 40: an option > nit: probably don't need this Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38 PS5, Line 38: : Instead, a full table scan is done by default. Other databases may optimize such scans by building secondary indexes : (though it might be redundant to build one on one of the > Let's stick with a single concrete example, say `tstamp`. Then we can point Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41 PS5, Line 41: given its lack of secondary index support. : : The question is, can Kudu do better than a full table scan here? : > nit: I think this would read better after L45. E.g. Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47 PS5, Line 47: in the in > nit: since this is a concrete example, we know there is only one column bef Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50 PS5, Line 50: : {% highlight SQL %} > nit: reword as "to **skip** to the rows that have distinct prefix keys, and Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61 PS5, Line 61: ce, this metho > nit: "Kudu tablet" or "tablet server" or "Kudu" Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61 PS5, Line 61: as **skip scan optimization**[2-3]. : : Performance > Maybe reverse the order of **skip** and **scan**, since the name is "skip s Done. You are correct, I have rephrased the sentence to better clarify this point. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70 PS5, Line 70: > nit: add "the" in front of "Lower" and "better" Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71 PS5, Line 71: nts, on up to 10 million rows per t > I seem to recall a plot that showed the performance without the dynamic dis That's a good point. Unfortunately, I do not have the backup of that slide. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77 PS5, Line 77: > nit: skips? for consistency with the "skip" and "scan" terminology Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77 PS5, Line 77: It will be an in > nit: I think it's clear enough that this may refer to multiple, so maybe ju Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89 PS5, Line 89: > nit: one (`host`) Done http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104 PS5, Line 104: [[1 > Do you feel good about adding one more reference? I think https://www.sqli Thanks so much for this suggestion. I have added this reference too. http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105 PS5, Line 105: Geo- > nit: usually in the reference section they use '[x]' where it's possible to Thank you for pointing this out. Done. -- To view, visit
[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.
Hello Alexey Serbin, Mike Percy, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11263 to look at the new patch set (#6). Change subject: Blogpost describing index skip scan optimization. .. Blogpost describing index skip scan optimization. Link to the version with images: https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e --- A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md A img/index-skip-scan/example-table.png A img/index-skip-scan/skip-scan-example-table.png A img/index-skip-scan/skip-scan-performance-graph.png 4 files changed, 111 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/6 -- To view, visit http://gerrit.cloudera.org:8080/11263 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: gh-pages Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e Gerrit-Change-Number: 11263 Gerrit-PatchSet: 6 Gerrit-Owner: Anupama Gupta Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Anupama Gupta Gerrit-Reviewer: Mike Percy