[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-24 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..

generic_iterators: prep for MergeIterator dominance

This patch adds a counter to the MergeIterator that tracks the number of
comparisons made during the lifetime of the iterator. When the dominance
algorithm is added, the counter's value ought to drop. TestMerge now logs
the counter's value, and can also generate non-overlapped inputs if desired.

The counter isn't exposed to users. It could be added to IteratorStats and
set for every key column, but even then it'll only apply to ORDERED scans,
so the value is dubious.

Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Reviewed-on: http://gerrit.cloudera.org:8080/12196
Reviewed-by: Mike Percy 
Tested-by: Kudu Jenkins
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
3 files changed, 49 insertions(+), 7 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 9
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 25 Jan 2019 01:00:30 +
Gerrit-HasComments: No


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 25 Jan 2019 00:56:11 +
Gerrit-HasComments: No


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-23 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 24 Jan 2019 01:05:10 +
Gerrit-HasComments: No


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-11 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h
File src/kudu/common/generic_iterators.h:

http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184
PS1, Line 184:
> I don't know; I didn't measure the impact of the histogram itself. Looking
yea I think a single counter should be low enough overhead that i'm not too 
concerned, especially considering we expect your improvements to have a really 
large perf improvement associated



--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Sat, 12 Jan 2019 00:28:36 +
Gerrit-HasComments: Yes


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-11 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG@9
PS2, Line 9: This patch adds a histogram to the MergeIterator that tracks the 
number of
> instead of a histogram isn't what we really care about just the total numbe
I originally went with a histogram because in addition to the total number of 
comparisons, I also wanted to get a sense for how they were distributed across 
the lifetime of the MergeIterator. Spitting out the percentiles as I did 
doesn't really accomplish that, nor does the distribution really matter for the 
purposes of computing the increase in efficiency (we assume that every scan 
runs to completion, so the total CPU consumption is the only thing that really 
matters).

Anyway, I'll switch this to a raw counter.


http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h
File src/kudu/common/generic_iterators.h:

http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184
PS1, Line 184:   HdrHistogram sub_iter_histo_;
> I see the appeal of this for your testing, but does this have adverse perf
I don't know; I didn't measure the impact of the histogram itself. Looking at 
the implementation, it's conceivable that it may have an impact, as it does 
some arithmetic, atomic increments, and CASes.

Are you open to a counter, even if it was only used in tests? IteratorStats 
would probably be the best home, but those are maintained on a per-column 
basis; "number of key comparisons" doesn't really make sense in that context, 
unless you want it broken up by key column. And beyond that, it's a metric that 
very clearly only matters if using ORDERED scans.



--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Fri, 11 Jan 2019 23:53:50 +
Gerrit-HasComments: Yes


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-11 Thread Adar Dembo (Code Review)
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12196

to look at the new patch set (#5).

Change subject: generic_iterators: prep for MergeIterator dominance
..

generic_iterators: prep for MergeIterator dominance

This patch adds a counter to the MergeIterator that tracks the number of
comparisons made during the lifetime of the iterator. When the dominance
algorithm is added, the counter's value ought to drop. TestMerge now logs
the counter's value, and can also generate non-overlapped inputs if desired.

The counter isn't exposed to users. It could be added to IteratorStats and
set for every key column, but even then it'll only apply to ORDERED scans,
so the value is dubious.

Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
3 files changed, 49 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/12196/5
--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12196 )

Change subject: generic_iterators: prep for MergeIterator dominance
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG@9
PS2, Line 9: This patch adds a histogram to the MergeIterator that tracks the 
number of
instead of a histogram isn't what we really care about just the total number of 
Compare calls? ie the point of the dominance stuff is just to avoid comparisons 
(and thus avoid CPU)


http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h
File src/kudu/common/generic_iterators.h:

http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184
PS1, Line 184:   HdrHistogram sub_iter_histo_;
I see the appeal of this for your testing, but does this have adverse perf 
impact? Maybe we should have a clear idea of how this would be exposed (and 
useful) to users before including it?



--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:58:28 +
Gerrit-HasComments: Yes


[kudu-CR] generic iterators: prep for MergeIterator dominance

2019-01-09 Thread Adar Dembo (Code Review)
Hello Mike Percy, Grant Henke, Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/12196

to review the following change.


Change subject: generic_iterators: prep for MergeIterator dominance
..

generic_iterators: prep for MergeIterator dominance

This patch adds a histogram to the MergeIterator that tracks the number of
sub-iterators available for merging whenever a row is copied. When the
dominance algorithm is added, this histogram ought to demonstrate an
improvement in the overall number of comparisons performed during the merge.

TestMerge has been modified to log the histogram results, and to allow
non-overlapped inputs to be generated.

I don't know whether a histogram is the best way to capture this data, but
it's the first thing that came to mind so I rolled with it. It'd be nice to
expose it as a Kudu metric, but it's not clear how to plumb that.

Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
3 files changed, 44 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/12196/1
--
To view, visit http://gerrit.cloudera.org:8080/12196
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b
Gerrit-Change-Number: 12196
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon