[kudu-CR] generic iterators: prep for MergeIterator dominance
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. generic_iterators: prep for MergeIterator dominance This patch adds a counter to the MergeIterator that tracks the number of comparisons made during the lifetime of the iterator. When the dominance algorithm is added, the counter's value ought to drop. TestMerge now logs the counter's value, and can also generate non-overlapped inputs if desired. The counter isn't exposed to users. It could be added to IteratorStats and set for every key column, but even then it'll only apply to ORDERED scans, so the value is dubious. Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Reviewed-on: http://gerrit.cloudera.org:8080/12196 Reviewed-by: Mike Percy Tested-by: Kudu Jenkins --- M src/kudu/common/generic_iterators-test.cc M src/kudu/common/generic_iterators.cc M src/kudu/common/generic_iterators.h 3 files changed, 49 insertions(+), 7 deletions(-) Approvals: Mike Percy: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 9 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] generic iterators: prep for MergeIterator dominance
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 8 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 25 Jan 2019 01:00:30 + Gerrit-HasComments: No
[kudu-CR] generic iterators: prep for MergeIterator dominance
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 7 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 25 Jan 2019 00:56:11 + Gerrit-HasComments: No
[kudu-CR] generic iterators: prep for MergeIterator dominance
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 6 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 24 Jan 2019 01:05:10 + Gerrit-HasComments: No
[kudu-CR] generic iterators: prep for MergeIterator dominance
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h File src/kudu/common/generic_iterators.h: http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184 PS1, Line 184: > I don't know; I didn't measure the impact of the histogram itself. Looking yea I think a single counter should be low enough overhead that i'm not too concerned, especially considering we expect your improvements to have a really large perf improvement associated -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 5 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Sat, 12 Jan 2019 00:28:36 + Gerrit-HasComments: Yes
[kudu-CR] generic iterators: prep for MergeIterator dominance
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG@9 PS2, Line 9: This patch adds a histogram to the MergeIterator that tracks the number of > instead of a histogram isn't what we really care about just the total numbe I originally went with a histogram because in addition to the total number of comparisons, I also wanted to get a sense for how they were distributed across the lifetime of the MergeIterator. Spitting out the percentiles as I did doesn't really accomplish that, nor does the distribution really matter for the purposes of computing the increase in efficiency (we assume that every scan runs to completion, so the total CPU consumption is the only thing that really matters). Anyway, I'll switch this to a raw counter. http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h File src/kudu/common/generic_iterators.h: http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184 PS1, Line 184: HdrHistogram sub_iter_histo_; > I see the appeal of this for your testing, but does this have adverse perf I don't know; I didn't measure the impact of the histogram itself. Looking at the implementation, it's conceivable that it may have an impact, as it does some arithmetic, atomic increments, and CASes. Are you open to a counter, even if it was only used in tests? IteratorStats would probably be the best home, but those are maintained on a per-column basis; "number of key comparisons" doesn't really make sense in that context, unless you want it broken up by key column. And beyond that, it's a metric that very clearly only matters if using ORDERED scans. -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 4 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 11 Jan 2019 23:53:50 + Gerrit-HasComments: Yes
[kudu-CR] generic iterators: prep for MergeIterator dominance
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12196 to look at the new patch set (#5). Change subject: generic_iterators: prep for MergeIterator dominance .. generic_iterators: prep for MergeIterator dominance This patch adds a counter to the MergeIterator that tracks the number of comparisons made during the lifetime of the iterator. When the dominance algorithm is added, the counter's value ought to drop. TestMerge now logs the counter's value, and can also generate non-overlapped inputs if desired. The counter isn't exposed to users. It could be added to IteratorStats and set for every key column, but even then it'll only apply to ORDERED scans, so the value is dubious. Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b --- M src/kudu/common/generic_iterators-test.cc M src/kudu/common/generic_iterators.cc M src/kudu/common/generic_iterators.h 3 files changed, 49 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/12196/5 -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 5 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon
[kudu-CR] generic iterators: prep for MergeIterator dominance
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/12196 ) Change subject: generic_iterators: prep for MergeIterator dominance .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12196/2//COMMIT_MSG@9 PS2, Line 9: This patch adds a histogram to the MergeIterator that tracks the number of instead of a histogram isn't what we really care about just the total number of Compare calls? ie the point of the dominance stuff is just to avoid comparisons (and thus avoid CPU) http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h File src/kudu/common/generic_iterators.h: http://gerrit.cloudera.org:8080/#/c/12196/1/src/kudu/common/generic_iterators.h@184 PS1, Line 184: HdrHistogram sub_iter_histo_; I see the appeal of this for your testing, but does this have adverse perf impact? Maybe we should have a clear idea of how this would be exposed (and useful) to users before including it? -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 2 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 09 Jan 2019 20:58:28 + Gerrit-HasComments: Yes
[kudu-CR] generic iterators: prep for MergeIterator dominance
Hello Mike Percy, Grant Henke, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/12196 to review the following change. Change subject: generic_iterators: prep for MergeIterator dominance .. generic_iterators: prep for MergeIterator dominance This patch adds a histogram to the MergeIterator that tracks the number of sub-iterators available for merging whenever a row is copied. When the dominance algorithm is added, this histogram ought to demonstrate an improvement in the overall number of comparisons performed during the merge. TestMerge has been modified to log the histogram results, and to allow non-overlapped inputs to be generated. I don't know whether a histogram is the best way to capture this data, but it's the first thing that came to mind so I rolled with it. It'd be nice to expose it as a Kudu metric, but it's not clear how to plumb that. Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b --- M src/kudu/common/generic_iterators-test.cc M src/kudu/common/generic_iterators.cc M src/kudu/common/generic_iterators.h 3 files changed, 44 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/12196/1 -- To view, visit http://gerrit.cloudera.org:8080/12196 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3a8db5973dc09b7271e05b3cc28025b7a2a9e21b Gerrit-Change-Number: 12196 Gerrit-PatchSet: 1 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Todd Lipcon