[kudu-CR] KUDU-2566: Enhance rowset tree pruning and stop coping string while querying

2018-09-09 Thread helifu (Code Review)
Hello Will Berkeley, Tidy Bot, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11381

to look at the new patch set (#6).

Change subject: KUDU-2566: Enhance rowset tree pruning and stop coping string   
 while querying
..

KUDU-2566: Enhance rowset tree pruning and stop coping string
   while querying

Three improvements:
1) Support open-ended intervals:
   Previously, the rowset tree could only compute overlap with a
   closed and bounded interval. This changeset adds the ability to
   find overlap for unbounded intervals as well. As a result, tablet
   iterators with a single primary key bound are more efficient
   because they do not iterate over rowsets that don't overlap with
   the key bound.
2) End up fetching one more rowset for the upper bound which is
   exclusive:
   This changeset also adds the ability to query the rowset tree
   using an exclusive upper bound, whereas previously only inclusive
   intervals were supported. This makes scans more efficient since
   primary key upper bounds are passed as exclusive bounds, so now
   rowsets that overlap only in the upper bound are excluded.
3) Perf improvement:
   using raw slices instead of copying to strings while querying.
   The copying from slices to string is discarded, which could help
   to increase the query performance.

Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79
---
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
M src/kudu/tablet/tablet.cc
M src/kudu/util/interval_tree-inl.h
M src/kudu/util/interval_tree-test.cc
M src/kudu/util/interval_tree.h
7 files changed, 369 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/11381/6
--
To view, visit http://gerrit.cloudera.org:8080/11381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e34b4169f93f3519f694c9bf86ca5295c318e79
Gerrit-Change-Number: 11381
Gerrit-PatchSet: 6
Gerrit-Owner: helifu 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [gutil] add LookupOrEmplace()

2018-09-09 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11401 )

Change subject: [gutil] add LookupOrEmplace()
..


Patch Set 1:

(3 comments)

The change looks fine with me, though I think Adar may the same change lying 
around somewhere.

See https://gerrit.cloudera.org/c/11303/4/src/kudu/gutil/map-util.h#464 for a 
discussion we had about it and emplace semantics.

http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/gutil/map-util.h
File src/kudu/gutil/map-util.h:

http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/gutil/map-util.h@498
PS1, Line 498: // It's similar to LookupOrInsert() but uses the emplace and 
r-value mechanics
 : // to achive the desired results.
Worth noting that if there is an element with the key already exists in the 
container, the arguments will still be "used up".

"The element may be constructed even if there already is an element with the 
key in the container, in which case the newly constructed element will be 
destroyed immediately."

>From https://en.cppreference.com/w/cpp/container/map/emplace


http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc
File src/kudu/util/map-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc@133
PS1, Line 133: auto* val_ptr = FindOrNull(int_map, key);
 : ASSERT_NE(nullptr, val_ptr);
Is there a particular reason to use `FindOrNull()` + `ASSERT_NE(nullptr)` 
instead of `FindOrDie()`? To introduce test coverage for `FindOrNull()` perhaps?


http://gerrit.cloudera.org:8080/#/c/11401/1/src/kudu/util/map-util-test.cc@171
PS1, Line 171: //ASSERT_NE(nullptr, val.get());
Uncomment? Below too. Also see my comment in the header about documenting these 
semantics.



--
To view, visit http://gerrit.cloudera.org:8080/11401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a177f410c1d4ed33e6f44cb7d6c33d6141f84fc
Gerrit-Change-Number: 11401
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Mon, 10 Sep 2018 03:16:37 +
Gerrit-HasComments: Yes


[kudu-CR] [jepsen] set default admin operation timeout to 5 minutes

2018-09-09 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11405

to look at the new patch set (#2).

Change subject: [jepsen] set default admin operation timeout to 5 minutes
..

[jepsen] set default admin operation timeout to 5 minutes

Increased the default admin operation timeout for the Kudu client
used in Jepsen tests up to 5 minutes.  That's to work around slow DNS
resolutions (sometimes up to 30 seconds) that happen intermittently
while running the tests at VMs provisioned in GCE.

Change-Id: Iaa9ade5cf38b4f3247749fa33bd40ada9616f21f
---
M java/kudu-jepsen/src/main/clojure/jepsen/kudu/client.clj
1 file changed, 3 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/05/11405/2
--
To view, visit http://gerrit.cloudera.org:8080/11405
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa9ade5cf38b4f3247749fa33bd40ada9616f21f
Gerrit-Change-Number: 11405
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6
PS4, Line 6:
   : Blogpost describing index skip scan optimization.
   :
> Thanks for this Andrew. I am still not sure why images are not getting rend
Resolved this issue now. Added a link to the rendered version.



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Sun, 09 Sep 2018 20:13:26 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73: exceeds 
![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
: Therefore, in order to use skip scan performance benefits when 
possible and maintain a consistent performance with
> I think it's the number of rows in the CFileSet, which I think is also the
You are right ! How about rewording this to "rows in tablet" ?


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS5, Line 9: index skip scan (a.k.
> It's great that you found another reference to the same idea in the google'
Sounds good. Done.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13
PS5, Line 13: Let's b
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS5, Line 40:  an option
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38
PS5, Line 38:
: Instead, a full table scan is done by default. Other databases 
may optimize such scans by building secondary indexes
: (though it might be redundant to build one on one of the
> Let's stick with a single concrete example, say `tstamp`. Then we can point
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41
PS5, Line 41: given its lack of secondary index support.
: 
: The question is, can Kudu do better than a full table scan here?
:
> nit: I think this would read better after L45. E.g.
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS5, Line 47: in the in
> nit: since this is a concrete example, we know there is only one column bef
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50
PS5, Line 50:
: {% highlight SQL %}
> nit: reword as "to **skip** to the rows that have distinct prefix keys, and
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: ce, this metho
> nit: "Kudu tablet" or "tablet server" or "Kudu"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: as **skip scan optimization**[2-3].
:
: Performance
> Maybe reverse the order of **skip** and **scan**, since the name is "skip s
Done. You are correct, I have rephrased the sentence to better clarify this 
point.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70
PS5, Line 70:
> nit: add "the" in front of "Lower" and "better"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71
PS5, Line 71: nts, on up to 10 million rows per t
> I seem to recall a plot that showed the performance without the dynamic dis
That's a good point. Unfortunately, I do not have the backup of that slide.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77:
> nit: skips? for consistency with the "skip" and "scan" terminology
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77: It will be an in
> nit: I think it's clear enough that this may refer to multiple, so maybe ju
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89
PS5, Line 89:
> nit: one (`host`)
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104
PS5, Line 104: [[1
> Do you feel good about adding one more reference?  I think https://www.sqli
Thanks so much for this suggestion. I have added this reference too.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105
PS5, Line 105: Geo-
> nit: usually in the reference section they use '[x]' where it's possible to
Thank you for pointing this out. Done.



--
To view, visit 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#6).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 111 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/6
--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy