Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13666

to look at the new patch set (#3).

Change subject: KUDU-2866. Optimize CFileSet::Iterator::FinishBatch
......................................................................

KUDU-2866. Optimize CFileSet::Iterator::FinishBatch

I was running SELECT * FROM wide_table WHERE col = <non-matching>
and found that 10% of the CPU was used in this function. Looking at the
disassembly, it seemed that most of the cycles were going to iteration
over the vector<bool> (bitset) of prepared columns. In this case, only
one column was prepared (the one with the predicate), so the iteration
and bitset testing were a big waste of CPU.

In fact, any given column will only be prepared once, so we can just
keep a simple list of the prepared columns and iterate over those
explicitly, making the loop O(number of prepared columns) instead of
O(total number of columns).
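
The idea above can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the code in this patch:
ColumnIter, BatchIter, PrepareColumn, and prepared_idxs_ are hypothetical
names, and the "skip if already prepared" check is an assumption about how
duplicates might be avoided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-column iterator; not Kudu's actual class.
struct ColumnIter {
  bool prepared = false;
  int finish_calls = 0;
  void PrepareBatch() { prepared = true; }
  void FinishBatch() { prepared = false; ++finish_calls; }
};

// Hypothetical stand-in for the row-set iterator that owns the columns.
class BatchIter {
 public:
  explicit BatchIter(size_t num_columns) : cols_(num_columns) {}

  // Prepare a column at most once per batch and remember its index.
  void PrepareColumn(size_t col_idx) {
    assert(col_idx < cols_.size());
    if (cols_[col_idx].prepared) return;  // assumption: repeat calls are no-ops
    cols_[col_idx].PrepareBatch();
    prepared_idxs_.push_back(col_idx);
  }

  // Visit only the columns prepared in this batch, rather than testing a
  // per-column flag for every column. Returns the number of columns finished.
  size_t FinishBatch() {
    const size_t n = prepared_idxs_.size();
    for (size_t idx : prepared_idxs_) {
      cols_[idx].FinishBatch();
    }
    prepared_idxs_.clear();
    return n;
  }

  const ColumnIter& column(size_t i) const { return cols_[i]; }

 private:
  std::vector<ColumnIter> cols_;
  std::vector<size_t> prepared_idxs_;  // indexes of columns prepared this batch
};
```

With a 200-column table and a predicate on a single column, FinishBatch()
in this sketch touches one entry instead of scanning all 200 flags.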

This dropped CFileSet::Iterator::FinishBatch() from taking 12.4% of the
scanner CPU down to about 0.2%.

Change-Id: I997fe832fdfa8d92fbbcb5d7c5bd4141e485b4f8
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 16 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/13666/3
--
To view, visit http://gerrit.cloudera.org:8080/13666
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I997fe832fdfa8d92fbbcb5d7c5bd4141e485b4f8
Gerrit-Change-Number: 13666
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
