[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 12:

This post is now live: 
http://kudu.apache.org/2018/09/26/index-skip-scan-optimization-in-kudu.html


--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 12
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 17:57:21 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-26 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-09-25-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Reviewed-on: http://gerrit.cloudera.org:8080/11263
Reviewed-by: Mike Percy 
Tested-by: Mike Percy 
---
A _posts/2018-09-26-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 114 insertions(+), 0 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved; Verified

-- 

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: merged
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 12
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 11: Verified+1 Code-Review+2

Rendered locally with site_tool jekyll serve and it looks good. I'm about to 
push this live.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 11
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 17:50:21 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-26 Thread Mike Percy (Code Review)
Mike Percy has uploaded a new patch set (#11) to the change originally created 
by Anupama Gupta. ( http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-09-25-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-09-26-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 114 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/11
--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 11
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 10:

I noticed yesterday afternoon that I got distracted and didn't post this. I'm 
going to wrangle up a new filename and push it out right now.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 17:25:57 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-25 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 10: Verified+1

awesome, thanks Anupama


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 25 Sep 2018 08:01:02 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 10:

I'm planning on pushing this out and tweeting it out from @ApacheKudu tomorrow 
morning (9/25) California time.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 23:32:29 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 10: Code-Review+2


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 23:31:40 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#10).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-09-25-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-09-25-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 114 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/10
--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 10
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9:

Sorry for the confusion. I have renamed the file in the GitHub version to 
match the latest patch set. The updated file name is 
'2018-09-25-index-skip-scan-optimization-in-kudu.md'.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 21:40:25 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9:

> Patch Set 9:
>
> > Patch Set 9: Verified-1
> >
> > we should probably change the date in the filename as it would be below 
> > Mac's pipeline post otherwise, so I'm setting verified to -1 for now.
>
> Attila, I don't understand what you're saying here. Can you clarify?

Oh, sorry. I meant that this post is dated 2018-08-17 and we already have a 
published post from 2018-09-11, so this one would not appear at the top of the 
blog but in second place. Renaming the file to 
2018-09-24-index-skip-scan-optimization-in-kudu.md would make sure it goes to 
the top.
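
The ordering concern above can be sketched in a few lines. This is only an 
illustration, assuming the blog index lists posts newest-first by the 
YYYY-MM-DD prefix of the filename; the helper names and the filename used for 
the 2018-09-11 post are hypothetical placeholders, not taken from the site.

```python
from datetime import date


def post_date(filename: str) -> date:
    """Parse the leading YYYY-MM-DD prefix of a Jekyll-style post filename."""
    year, month, day = filename.split("-", 3)[:3]
    return date(int(year), int(month), int(day))


def newest_first(filenames):
    """Order posts the way a newest-first blog index would list them."""
    return sorted(filenames, key=post_date, reverse=True)


# Placeholder name: the real filename of the 2018-09-11 post is not given here.
EARLIER_POST = "2018-09-11-previously-published-post.md"

before = newest_first([
    "2018-08-17-index-skip-scan-optimization-in-kudu.md",
    EARLIER_POST,
])
after = newest_first([
    "2018-09-24-index-skip-scan-optimization-in-kudu.md",
    EARLIER_POST,
])

print(before[0])  # with the old date, the earlier post still leads
print(after[0])   # after the rename, the skip scan post leads
```

Under that assumption, only the date prefix needs to change; the rest of the 
filename can stay as it is.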


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 21:23:53 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9:

> Patch Set 9: Verified-1
>
> we should probably change the date in the filename as it would be below Mac's 
> pipeline post otherwise, so I'm setting verified to -1 for now.

Attila, I don't understand what you're saying here. Can you clarify?


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 20:53:13 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9: Verified-1

we should probably change the date in the filename as it would be below Mac's 
pipeline post otherwise, so I'm setting verified to -1 for now.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 20:34:23 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9: Code-Review+1


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 18:54:33 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 9: Code-Review+2

Thanks! I'll work on getting this posted.


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 16:58:11 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-23 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@10
PS8, Line 10:
> nit: I would add something along these lines here to help transition to the
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@36
PS8, Line 36: contains
> nit: "only contains the"
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS8, Line 74: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
> that would work too, yeah
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@76
PS8, Line 76: decide
> s/decide/have tentatively chosen/
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS8, Line 77: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
> Maybe a simple text representation would fit here as well?
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@78
PS8, Line 78: take
> s/take/project/
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@88
PS8, Line 88: current implementation
> s/current implementation/implementation in the patch/
Done


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@109
PS8, Line 109: [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database
 :
 : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite
> I see it now; feel free to ignore this.
Done



--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Sun, 23 Sep 2018 14:25:00 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-23 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#9).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-09-25-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 114 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/9
--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 9
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-18 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS8, Line 74: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
> As an alternative approach, consider a simple text representation:
that would work too, yeah



--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 18 Sep 2018 23:34:49 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-18 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@109
PS8, Line 109: [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database
 :
 : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite
> I'm not seeing references to these links in this blog post
I see it now; feel free to ignore this.



--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 18 Sep 2018 23:34:27 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-18 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS8, Line 74: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
> download this, check it in, and include it in the gerrit review please
As an alternative approach, consider a simple text representation:

sqrt(number_of_rows_in_tablet)

Would that fit as well?
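
Read as code, the plain-text expression is a threshold. The sketch below is 
only an illustration: it assumes the threshold is compared against the number 
of distinct prefix keys, which this review thread does not spell out, and the 
function and parameter names are hypothetical rather than taken from the Kudu 
patch.

```python
import math


def skip_scan_threshold(number_of_rows_in_tablet: int) -> float:
    """The text form of the formula: sqrt(number_of_rows_in_tablet)."""
    return math.sqrt(number_of_rows_in_tablet)


def within_threshold(distinct_prefix_keys: int, number_of_rows_in_tablet: int) -> bool:
    """Assumed usage: compare a count of distinct prefix keys to the threshold."""
    return distinct_prefix_keys <= skip_scan_threshold(number_of_rows_in_tablet)


print(skip_scan_threshold(1_000_000))
print(within_threshold(100, 1_000_000))
print(within_threshold(5_000, 1_000_000))
```

For a tablet of one million rows the threshold works out to 1000, so the text 
form carries the same information as the rendered image.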


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS8, Line 77: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
> please check this in
Maybe a simple text representation would fit here as well?

sqrt(number_of_rows_in_tablet)



--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 18 Sep 2018 22:58:04 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-18 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(8 comments)

Sorry for the delay in reviewing this. This looks good; I just have a few 
additional nitpicks, and then I think this is ready to post.

Since we are getting ready to post this, we should change the date of the blog 
post to a date in the future. How about 2018-09-25 (next Tuesday), when we can 
post this?

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@10
PS8, Line 10:
nit: I would add something along these lines here to help transition to the 
next paragraph:

  I wanted to share my experience and the progress we've made so far on the 
approach.


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@36
PS8, Line 36: contains
nit: "only contains the"


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS8, Line 74: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
download this, check it in, and include it in the gerrit review please


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@76
PS8, Line 76: decide
s/decide/have tentatively chosen/


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS8, Line 77: 
http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D
please check this in


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@78
PS8, Line 78: take
s/take/project/


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@88
PS8, Line 88: current implementation
s/current implementation/implementation in the patch/


http://gerrit.cloudera.org:8080/#/c/11263/8/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@109
PS8, Line 109: [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): Index Skip Scanning - Oracle Database
 :
 : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip Scan - SQLite
I'm not seeing references to these links in this blog post



--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 18 Sep 2018 22:28:58 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8: Code-Review+2

Looks good!  Thank you for the post!


--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 17:54:20 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-13 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#8).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 113 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/8
--

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-13 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 8:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@7
PS7, Line 7: team
> nit: lower-case "team"
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@27
PS7, Line 27: table
> nit: lower-case "table"
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS7, Line 45: . We will refer to it as the
: "prefix column" and its specific value as the "prefix k
> nit: drop the parens and start a new sentence instead.
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS7, Line 47:
> nit: remove comma, maybe replace with "that"
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@82
PS7, Line 82:
: Conclusion
: ==
> No where does this mention that, as implemented, this works for equality pr
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS7, Line 98: roughly enjo
> nit: full-fledged
Done


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@99
PS7, Line 99: right from underst
> nit: "of the skip scan approach" or "of the skip scan optimization"
Done



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 8
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 13 Sep 2018 22:16:44 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-13 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 7: Code-Review+1

(7 comments)

Some tiny nits and one real suggestion, but otherwise LGTM.

http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@7
PS7, Line 7: Team
nit: lower-case "team"


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@27
PS7, Line 27: Table
nit: lower-case "table"


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS7, Line 45:  (we will refer to it as
: "prefix column" and its specific value as "prefix key")
nit: drop the parens and start a new sentence instead.
Also nit: "as the 'prefix column' and its specific value as the 'prefix key.'"


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS7, Line 47: ,
nit: remove comma, maybe replace with "that"


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@82
PS7, Line 82:
: Conclusion
: ==
Nowhere does this mention that, as implemented, this works for equality 
predicates. Should probably mention that.


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS7, Line 98: full fledged
nit: full-fledged


http://gerrit.cloudera.org:8080/#/c/11263/7/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@99
PS7, Line 99: skip scan approach
nit: "of the skip scan approach" or "of the skip scan optimization"



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 13 Sep 2018 17:57:03 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-12 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 7:

(13 comments)

Many thanks for the comments. Please take a look.

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS6, Line 39: table
> nit: tablet, here and elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS6, Line 45: (we will refer to it as
: "prefix column" and its specific value as "prefix key").
> nit: since we're not using these as a variable names, but rather as definit
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48
PS6, Line 48: Therefore, we can use the index to skip to the rows that have 
distinct prefix keys,
: and also satisfy the predicate on the `tstamp` column.
> nit: maybe drop the ** around "skip" here, since you do it down below anywa
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58
PS6, Line 58: `host = helium`
> nit: would be nice if the entire thing were in backticks, since it's a cond
You are right. Made the change.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60
PS6, Line 60: satisfy the predicate, and we
> nit: probably not needed
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59
PS6, Line 59: . At that
: point we would
> nit: reword "until the predicate no longer matches. At that point we would
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67
PS6, Line 67: tio
> nit: this is a little distracting, below too. Let's just keep it singular s
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS6, Line 73: o get worse with respect to the full tablet scan performance when 
the prefix column cardinality
> This and below are being rendered weirdly by github. Would like to confirm
Yes, it renders fine with Jekyll. Screenshot:
https://raw.githubusercontent.com/AnupamaGupta01/kudu-1/gh-pages-staging/img/index-skip-scan/equation.png


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS6, Line 74: C%20tablet%20%7D).
: Therefore, in order to use skip scan perf
> "consistent performance in cases of large prefix column cardinality"
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93
PS6, Line 93:
> nit: probably not needed, below too.
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94
PS6, Line 94: Range pr
> nit: In-list
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS6, Line 98: orki
> nit: team
Done


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101
PS6, Line 101:
 : References
 : ==
 :
 : 
[[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf):
 Gupta, Ashish, et al. "Mesa:
 : Geo-replicated, near real-time, scalable data warehousing." 
Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.
 :
 : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): 
Index Skip Scanning - Oracle Database
 :
> It's really up to you, but WDYT about just linking these in-line? This is a
Thanks, I see your point. I think that the current section for references looks 
fine after incorporating Alexey's suggestions on the same (in Patch 4, L62).



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 13 Sep 2018 03:35:44 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-12 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Attila Bukor, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#7).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 112 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/7
--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 7
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-11 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 6:

(14 comments)

Almost all nits pretty much. Looking good!

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/6//COMMIT_MSG@9
PS6, Line 9: Link to the version with images:
Very nice :)


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS6, Line 39: table
nit: tablet, here and elsewhere


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS6, Line 45: (we will refer to it as
: `prefix column` and it's specific value as `prefix key`)
nit: since we're not using these as variable names, but rather as 
definitions, we should use quotation marks. Also drop the apostrophe in "its". I.e.:

we will refer to it as the "prefix column" and its specific value as the 
"prefix key"


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@48
PS6, Line 48: Therefore, we can use the index to **skip** to the rows that have 
distinct prefix keys,
: and also satisfy the predicate on the `tstamp` column.
nit: maybe drop the ** around "skip" here, since you do it down below anyway.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@58
PS6, Line 58: `host` = helium
nit: would be nice if the entire thing were in backticks, since it's a 
condition? Seems a little awkward, this mix of backticks and no backticks. WDYT?


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@60
PS6, Line 60:  such as `ubuntu`, `westeros`
nit: probably not needed


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@59
PS6, Line 59:  with
: this prefix key
nit: reword as "until the predicate no longer matches. At that point we would know 
that no more rows with `host = helium` will satisfy the predicate, and we can 
skip to the next prefix key."


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@67
PS6, Line 67: (s)
nit: this is a little distracting, below too. Let's just keep it singular since 
you call out at the end that it can be any number of prefix columns anyway.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS6, Line 73: 
![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
This and below are being rendered weirdly by GitHub. Would like to confirm this 
doesn't happen with Jekyll.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@74
PS6, Line 74: consistent performance with
: respect to the prefix columns cardinality
"consistent performance in cases of large prefix column cardinality"


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@93
PS6, Line 93:  on the non-first key columns(s)
nit: probably not needed, below too.


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@94
PS6, Line 94: IN list
nit: In-list


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@98
PS6, Line 98: Team
nit: team


http://gerrit.cloudera.org:8080/#/c/11263/6/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@101
PS6, Line 101: References
 : ==
 :
 : 
[[1]](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42851.pdf):
 Gupta, Ashish, et al. "Mesa:
 : Geo-replicated, near real-time, scalable data warehousing." 
Proceedings of the VLDB Endowment 7.12 (2014): 1259-1270.
 :
 : [[2]](https://oracle-base.com/articles/9i/index-skip-scanning/): 
Index Skip Scanning - Oracle Database
 :
 : [[3]](https://www.sqlite.org/optoverview.html#skipscan): Skip 
Scan - SQLite
It's really up to you, but WDYT about just linking these in-line? This is a 
webpage after all :)



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6
PS4, Line 6:
   : Blogpost describing index skip scan optimization.
   :
> Thanks for this Andrew. I am still not sure why images are not getting rend
Resolved this issue now. Added a link to the rendered version.



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Sun, 09 Sep 2018 20:13:26 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 6:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73: exceeds 
![](http://latex.codecogs.com/gif.download?%5Csqrt%20%7B%20%5C%23rows%5C%20in%5C%20tablet%20%7D).
: Therefore, in order to use skip scan performance benefits when 
possible and maintain a consistent performance with
> I think it's the number of rows in the CFileSet, which I think is also the
You are right! How about rewording this to "rows in tablet"?
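
For illustration only, the heuristic discussed in this comment (skip scan stops paying off once the prefix column cardinality exceeds the square root of the number of rows in the tablet) could be sketched as below. The function name and its parameters are hypothetical and are not taken from the Kudu code base; the actual implementation may measure things differently.

```python
import math


def skip_scan_worthwhile(prefix_cardinality, rows_in_tablet):
    """Hypothetical check: prefer skip scan only while the number of
    distinct prefix keys does not exceed sqrt(#rows in tablet)."""
    return prefix_cardinality <= math.sqrt(rows_in_tablet)


# With 10 million rows per tablet the threshold is roughly 3162 prefix keys.
print(skip_scan_worthwhile(3000, 10_000_000))
print(skip_scan_worthwhile(5000, 10_000_000))
```

The sketch takes the row count per tablet rather than per table, matching the "rows in tablet" wording proposed above.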


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS5, Line 9: index skip scan (a.k.
> It's great that you found another reference to the same idea in the google'
Sounds good. Done.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13
PS5, Line 13: Let's b
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS5, Line 40:  an option
> nit: probably don't need this
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38
PS5, Line 38:
: Instead, a full table scan is done by default. Other databases 
may optimize such scans by building secondary indexes
: (though it might be redundant to build one on one of the
> Let's stick with a single concrete example, say `tstamp`. Then we can point
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41
PS5, Line 41: given its lack of secondary index support.
: 
: The question is, can Kudu do better than a full table scan here?
:
> nit: I think this would read better after L45. E.g.
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS5, Line 47: in the in
> nit: since this is a concrete example, we know there is only one column bef
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50
PS5, Line 50:
: {% highlight SQL %}
> nit: reword as "to **skip** to the rows that have distinct prefix keys, and
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: ce, this metho
> nit: "Kudu tablet" or "tablet server" or "Kudu"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: as **skip scan optimization**[2-3].
:
: Performance
> Maybe reverse the order of **skip** and **scan**, since the name is "skip s
Done. You are correct, I have rephrased the sentence to better clarify this 
point.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70
PS5, Line 70:
> nit: add "the" in front of "Lower" and "better"
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71
PS5, Line 71: nts, on up to 10 million rows per t
> I seem to recall a plot that showed the performance without the dynamic dis
That's a good point. Unfortunately, I do not have the backup of that slide.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77:
> nit: skips? for consistency with the "skip" and "scan" terminology
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@77
PS5, Line 77: It will be an in
> nit: I think it's clear enough that this may refer to multiple, so maybe ju
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@89
PS5, Line 89:
> nit: one (`host`)
Done


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104
PS5, Line 104: [[1
> Do you feel good about adding one more reference?  I think https://www.sqli
Thanks so much for this suggestion. I have added this reference too.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105
PS5, Line 105: Geo-
> nit: usually in the reference section they use '[x]' where it's possible to
Thank you for pointing this out. Done.



--
To view, visit 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-09 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#6).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Link to the version with images:
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 111 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/6
--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 6
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-04 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 5:

(4 comments)

Great progress!  Some more nits in addition to what Andrew already pointed at.

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73:
: Based on our experiments, on up to 10 million rows per tablet (as 
shown below), we found that the skip scan performa
> 1) Yes, these experiments were based on table schema and query pattern ment
I see.  Thank you for the information.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS5, Line 9: [index skip scan][1].
It's great that you found another reference to the same idea in Google's 
paper [2].  Do you think it's worth mentioning the other name for the 
technique?  Something like 'index skip scan (a.k.a. scan-to-seek, see section 
4.1 in [2])'.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@104
PS5, Line 104: [1]
Do you feel good about adding one more reference?  I think 
https://www.sqlite.org/optoverview.html#skipscan is also a good one.  But it's 
up to you.


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@105
PS5, Line 105: [2]:
nit: usually in the reference section they use '[x]' where it's possible to 
follow the link simply by clicking on it.  To enable those square brackets to 
appear in the rendered output, you need to duplicate them, e.g.

[[1]](https://my.url.io/) Mega-turbo resource

Or do you want them to be just numbers followed by colons?



--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 5
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 04 Sep 2018 22:33:55 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-04 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 5:

> Patch Set 5:
>
> (14 comments)
>
> Hrm, I'm not sure why it's not rendering on github for you. Maybe post a 
> screenshot of the rendered jekyll? That'd be helpful too.

P.S. thanks for updating this! Looking much better so far :)


--
To view, visit http://gerrit.cloudera.org:8080/11263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 5
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 04 Sep 2018 19:43:14 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-04 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 5:

(14 comments)

Hrm, I'm not sure why it's not rendering on github for you. Maybe post a 
screenshot of the rendered jekyll? That'd be helpful too.

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73:
: Based on our experiments, on up to 10 million rows per tablet (as 
shown below), we found that the skip scan performa
> Added explanation about how we came to using this simple heuristic. Yes, it
I think it's the number of rows in the CFileSet, which I think is also the 
number of rows in the b-tree, but it isn't equal to the number of rows in the 
table (since that spans multiple tablets).


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@13
PS5, Line 13: Example
nit: probably don't need this


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@38
PS5, Line 38: first key column(s)
: (`tstamp` and/or `clusterid`)? In this case, since the column 
value might be present anywhere in the index structure,
: the current query execution plan does not use the index.
Let's stick with a single concrete example, say `tstamp`. Then we can point to 
the example above:
"In the above case, the `tstamp` values are sorted with respect to `host`, but 
are not globally sorted, and as such, it's non-trivial to use the index to 
filter rows."


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS5, Line 40:  by default
nit: probably don't need this


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@41
PS5, Line 41: To optimize this scan time, a possible solution is to build 
secondary index on the required key column (although, it might be
: redundant to build secondary index on composite key column).
: However, we do not consider this solution as Kudu does not 
support secondary indexes yet.
:
nit: I think this would read better after L45. E.g.

Other databases may optimize such scans by building secondary indexes (though it 
might be redundant to build one on one of the primary keys). However, this 
isn't an option for Kudu, given its lack of secondary index support. The 
question is, can Kudu do better than a full table scan here?


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@47
PS5, Line 47: column(s)
nit: since this is a concrete example, we know there is only one column before 
`tstamp`


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@50
PS5, Line 50: seek to the rows containing distinct prefix keys
: and satisfying the query predicate on the `tstamp` column.
nit: reword as "to **skip** to the rows that have distinct prefix keys, and 
also satisfy the predicate on the `tstamp` column."


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61:  query server
nit: "Kudu tablet" or "tablet server" or "Kudu"


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS5, Line 61: **scan** all rows for which `host` = `helium` and `tstamp` = 100 
and consequently,
: **skip** all the rows for which host = `helium` and `tstamp` != 
100
: (holds true for all distinct keys of `host` such as `ubuntu`, 
`westeros`).
Maybe reverse the order of **skip** and **scan**, since the name is "skip 
scan"? Also, isn't the actual order to skip to a distinct prefix that may 
match the predicate, and then scan through rows until we know that the rows 
won't match the predicate within this prefix key?
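
To make the order concrete, here is a minimal sketch of that skip-then-scan loop for an equality predicate on `tstamp`. It is not Kudu's implementation: a sorted Python list of (host, tstamp, clusterid) tuples stands in for the primary key index, and the function name `skip_scan` and the sample rows are made up for illustration.

```python
import bisect
import math


def skip_scan(rows, tstamp):
    """Return the rows whose tstamp equals `tstamp`.

    `rows` is a list of (host, tstamp, clusterid) tuples sorted by that
    composite key; it is a stand-in for the primary key index.
    """
    matches = []
    i, n = 0, len(rows)
    while i < n:
        prefix = rows[i][0]  # current distinct prefix key (a `host` value)
        # Skip: seek to the first row that is >= (prefix, tstamp).
        i = bisect.bisect_left(rows, (prefix, tstamp), i, n)
        # Scan: collect rows until the predicate no longer matches.
        while i < n and rows[i][0] == prefix and rows[i][1] == tstamp:
            matches.append(rows[i])
            i += 1
        # Skip: seek past the remaining rows of this prefix key.
        i = bisect.bisect_left(rows, (prefix, math.inf), i, n)
    return matches


rows = sorted([
    ("helium", 100, 1), ("helium", 100, 2), ("helium", 200, 1),
    ("ubuntu", 50, 1), ("ubuntu", 100, 3),
    ("westeros", 200, 1),
])
print(skip_scan(rows, 100))
```

Each seek here is a binary search, so the work grows with the number of distinct `host` values rather than with the number of rows, which is why low prefix cardinality matters.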


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@70
PS5, Line 70: Lower the prefix column cardinality, better the skip scan 
performance
nit: add "the" in front of "Lower" and "better"


http://gerrit.cloudera.org:8080/#/c/11263/5/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@71
PS5, Line 71: skip scan is not a viable approach.
I seem to recall a plot that showed the performance without the dynamic 
disabling functionality. Do you still have that around? I think that would be 
interesting to put up since it exemplifies this 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-03 Thread Anupama Gupta (Code Review)
Anupama Gupta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 5:

(22 comments)

Please take a look.

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6
PS4, Line 6:
   : Blogpost describing index skip scan optimization.
   :
> In reviewing blogposts, it's generally helpful to post a link to a rendered
Thanks for this, Andrew. I am still not sure why the images are not getting 
rendered here: 
https://github.com/AnupamaGupta01/kudu/blob/blogpost-2/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md.
I do see the rendered version locally, though, using Jekyll.

Please let me know if I am missing something.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS4, Line 9: [index skip scan][1].
> This already seems like it's going a bit too far into implementation detail
Got it. Done.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@11
PS4, Line 11: 
:
> Probably don't need this.
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@14
PS4, Line 14:
> What do these do?
This tag is used to show the opening excerpt of the post. Earlier I mistook it 
for a newline marker. I moved the tag to after the first two lines and removed 
it from everywhere else.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31
PS4, Line 31:
> Maybe, add a reference (like https://en.wikipedia.org/wiki/B-tree) in-line
Makes sense. Added an in-line reference.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31
PS4, Line 31: 
> nit: "a B-tree", and no need to capitalize "Tree" below
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@33
PS4, Line 33:  `metric
> nit: perhaps using ``s would be more reasonable here (ie. `host`). Then it'
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32
PS4, Line 32: In this case, by default, Kudu internally builds a primary key 
index (implemented as a
: [B-tree](https://en.wikipedia.org/wiki/B-tree)) for the table 
`metrics`.
: As shown in the table above, the ind
> IMO this doesn't convey the idea that the data is sorted by the composite o
Thanks for this suggestion. I moved the example dataset from below and used it 
as a reference to elaborate on the points you mentioned.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32
PS4, Line 32: In this case, by default, Kudu internally builds a primary key 
index (implemented as a
: [B-tree](https://en.wikipedia.org/wiki/B-tree)) for the table 
`metrics`.
: As shown in the table above, the ind
> +1 all points mentioned by Andrew here.
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS4, Line 39: ?
> nit: here and elsewhere, no need for spaces before punctuation marks
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS4, Line 40: the current query execution plan does not use the index. Instead, 
a full tab
> I'm not sure this gives a clear explanation as for the reason to perform a
Rephrased this paragraph to clarify this point.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS4, Line 45: The question is,
> In general, I think the index skip scan optimization is not the only answer
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@46
PS4, Line 46:
> The crux of this is the prefixes are also sorted, and all rows of a given p
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@53
PS4, Line 53: For example, consider the query:
> nit: maybe, to be in sync with the CREATE TABLE statement above, write SQL
Done


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@51
PS4, Line 51: and satisfying the query predicate on the `tstamp` column.
:
: For example, consider the query:
: {% highlight SQL %}
: SELECT clusterid FROM metrics WHERE tstamp = 100;
: {% endhighlight %}
:
> Ah, so you _do_ have an example! I 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-09-03 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin, Mike Percy, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#5).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/example-table.png
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
4 files changed, 106 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/5

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 5
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-30 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 4:

I like the article. One thing I think we should do is mention that this is a 
work-in-progress patch and link to the Gerrit review so people can follow along 
if they want.



Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 4
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Thu, 30 Aug 2018 20:12:47 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31
PS4, Line 31: B-Tree
Maybe, add a reference (like https://en.wikipedia.org/wiki/B-tree) in-line or 
in a separate 'References' section?


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32
PS4, Line 32: The data is sorted lexicographically starting from the leftmost 
primary key column and stored in the B-Tree leaf nodes.
: Therefore, when the user query contains the first key column 
("host"), Kudu uses the primary key range push down
: operation to optimize the scan time.
> IMO this doesn't convey the idea that the data is sorted by the composite o
+1 all points mentioned by Andrew here.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@40
PS4, Line 40: (since the primary key index is sorted on the basis of the first 
key column)
I'm not sure this clearly explains why a full table scan is needed. Could you 
update this to explain why we cannot instantly locate the desired rows simply 
by using the primary key index?


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@45
PS4, Line 45: The answer is yes
In general, I think the index skip scan optimization is not the only answer. 
In other databases it's possible to build secondary indices, and that might 
work even better (of course, it depends on the read/write ratio for the use case 
and the availability of space to build an additional index).

I think it's worth mentioning that building a secondary index is not an 
option here, since Kudu does not support secondary indices yet.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@53
PS4, Line 53: select clusterid from metrics where tstamp = 100
nit: maybe, to be in sync with the CREATE TABLE statement above, write SQL 
keywords in capital letters.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@62
PS4, Line 62: popularly known as index skip scan optimization can skip all the 
rows for which host = "helium" and tstamp != 100
> nit: it's great to get to the point that we call this a "skip scan". To dri
Maybe, it's worth mentioning 'skip scan' earlier where you give a short 
overview of the idea behind the skip scan optimization.  Also, as for 
addressing the 'popularity' of the term, I think that adding some references in 
a separate section for various databases that implement that optimization might 
be useful (e.g., one of those links might be 
https://oracle-base.com/articles/9i/index-skip-scanning).


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@73
PS4, Line 73: Based on experiments on upto 10 million rows per tablet, we 
decided to disable skip scan when the number of seeks
: for distinct prefix column values exceeds 
![](https://latex.codecogs.com/gif.latex?%5Csqrt%7B%5C%23total%20rows%7D).
> This could use some explanation as to why sqrt(total_num_rows) was chosen.
Yep, it would be nice to add some details about the data and reasoning behind 
the choice of this disable-skip-scan criterion.

1) As for those experiments, did they use the table schema and query 
pattern mentioned above? Or did they involve some other table 
schemas and query patterns?
2) What was the rationale at the conceptual level for choosing the sqrt() metric?
3) If there were multiple candidate criteria to choose from, maybe it's worth 
mentioning those as well?
4) If 3 is true, was the sqrt() criterion a clear winner, or was there some 
fuzziness, with sqrt() chosen partly because it looks simpler than the 
others?
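
For reference while discussing the criterion, the rule quoted above ("disable 
skip scan when the number of seeks for distinct prefix column values exceeds 
sqrt(#total rows)") can be written as a one-line check. This is a minimal 
sketch under that reading of the post; the function and parameter names are 
hypothetical and are not taken from the Kudu patch.

```python
import math


def should_disable_skip_scan(num_seeks, total_rows):
    """Hypothetical helper: True once the seek count exceeds sqrt(total_rows)."""
    return num_seeks > math.sqrt(total_rows)


# With 10 million rows in a tablet the threshold is roughly 3162 seeks.
print(math.sqrt(10_000_000))
print(should_disable_skip_scan(3162, 10_000_000))
print(should_disable_skip_scan(3163, 10_000_000))
```

Seeing the threshold as a concrete number for the 10 million row case may help 
readers judge whether the cutoff matches the performance graph.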


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@78
PS4, Line 78: The performance graph of this approach is shown below
This is for the schema and query pattern mentioned earlier, right?  Maybe, it's 
worth mentioning that.




Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 4
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Comment-Date: Wed, 29 Aug 2018 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-29 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11263/4//COMMIT_MSG@6
PS4, Line 6:
   : Blogpost describing index skip scan optimization.
   :
In reviewing blogposts, it's generally helpful to post a link to a rendered 
version, e.g. posting to your own github, which will automatically render the 
*.md


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
File _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md:

http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@9
PS4, Line 9: does not contain the first column of the composite (multi-column) 
primary key.
This already seems like it's going a bit too far into implementation details. 
Maybe instead note something like: 'I optimized the Kudu scan-path by 
implementing a technique called an "index-skip scan."'


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@11
PS4, Line 11: Example
: ==
Probably don't need this.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@14
PS4, Line 14: 
What do these do?


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@31
PS4, Line 31: as B-Tree
nit: "a B-tree", and no need to capitalize "Tree" below


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@33
PS4, Line 33: ("host")
nit: perhaps using ``s would be more reasonable here (i.e. `host`). Then it'd be 
formatted in a monospace font. Here and elsewhere.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@32
PS4, Line 32: The data is sorted lexicographically starting from the leftmost 
primary key column and stored in the B-Tree leaf nodes.
: Therefore, when the user query contains the first key column 
("host"), Kudu uses the primary key range push down
: operation to optimize the scan time.
IMO this doesn't convey the idea that the data is sorted by the composite of 
all primary key columns. Also not sure what you mean by "primary key range push 
down operation".

Also, overall for this project, I think it's always been helpful to 
think/reason about it with some example data. I think having a dummy dataset 
of a handful of rows with a decent number of prefix keys would make this 
blogpost more understandable to the layperson. It'd also serve as a concrete 
example of why we can't use the PK index if the predicate doesn't contain the 
first key.


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@39
PS4, Line 39:
nit: here and elsewhere, no need for spaces before punctuation marks


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@46
PS4, Line 46: form a prefix.
The crux of this is that the prefixes are also sorted, and all rows of a given 
prefix are also sorted by the remaining PK columns. A prefix with no other 
properties isn't necessarily useful, so without calling that out, it might be 
hard to see why having these prefixes is helpful.
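
As a rough illustration of those two properties, the Python sketch below sorts 
a handful of made-up rows by the composite key and checks the ordering. The 
`(host, tstamp, clusterid)` layout, the `is_sorted` helper, and all host values 
other than `helium` are assumptions for the example, not data from the post.

```python
from itertools import groupby

# Hypothetical rows, sorted by the composite key (host, tstamp, clusterid).
rows = sorted([
    ("helium", 98, 1),
    ("helium", 100, 1),
    ("helium", 100, 2),
    ("ubuntu", 56, 1),
    ("ubuntu", 100, 3),
    ("westeros", 95, 1),
])


def is_sorted(values):
    """True if every element is <= the one that follows it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Across the whole table, tstamp is not ordered, so a predicate on tstamp
# alone cannot be answered with a single seek.
tstamps = [tstamp for _, tstamp, _ in rows]
print(is_sorted(tstamps))

# Within each prefix, tstamp is ordered, and the prefixes themselves are
# ordered, which is what makes seeking per prefix possible.
per_prefix = {
    host: [tstamp for _, tstamp, _ in group]
    for host, group in groupby(rows, key=lambda row: row[0])
}
print(all(is_sorted(ts) for ts in per_prefix.values()))
print(is_sorted(list(per_prefix)))
```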


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@51
PS4, Line 51: For example, consider the query :
: {% highlight SQL %}
: select clusterid from metrics where tstamp = 100;
: {% endhighlight %}
:
: ![png]({{ site.github.url 
}}/img/index-skip-scan/skip-scan-example-table.png){:height="500px" 
width="500px" .img-responsive}
: *Sample rows of Table "metrics" (sorted by key columns for 
simplicity).*
Ah, so you _do_ have an example! I think it'd be helpful to set this up at the 
start: here is how data is organized in Kudu today, and based on that, here is 
why it's not straightforward to use the index when there aren't predicates on 
the first primary key column, etc.

Also, isn't the data _actually_ stored like this? I.e. not for simplicity, but 
this actually represents how Kudu would see the data, doesn't it?


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@61
PS4, Line 61: host = "helium"
nit: backticks here too and everywhere else that has code snippets


http://gerrit.cloudera.org:8080/#/c/11263/4/_posts/2018-08-17-index-skip-scan-optimization-in-kudu.md@62
PS4, Line 62: popularly known as index skip scan optimization can skip all the 
rows for which host = "helium" and tstamp != 100
nit: it's great to get to the point that we 

[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-24 Thread Anupama Gupta (Code Review)
Hello Alexey Serbin,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11263

to look at the new patch set (#4).

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
3 files changed, 95 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/4

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 4
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 


[kudu-CR](gh-pages) Blogpost describing index skip scan optimization.

2018-08-17 Thread Anupama Gupta (Code Review)
Anupama Gupta has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11263 )

Change subject: Blogpost describing index skip scan optimization.
..

Blogpost describing index skip scan optimization.

Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
---
A _posts/2018-08-17-index-skip-scan-optimization-in-kudu.md
A img/index-skip-scan/skip-scan-example-table.png
A img/index-skip-scan/skip-scan-performance-graph.png
3 files changed, 64 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/11263/3

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2250652dcba3d1b0a06f1ffb7f23c11bf533d35e
Gerrit-Change-Number: 11263
Gerrit-PatchSet: 3
Gerrit-Owner: Anupama Gupta