Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13666 )
Change subject: KUDU-2866. Optimize CFileSet::Iterator::FinishBatch
......................................................................

KUDU-2866. Optimize CFileSet::Iterator::FinishBatch

I was running SELECT * FROM wide_table WHERE col = <non-matching> and
found that 10% of the CPU was used in this function. Looking at the
disassembly, it seemed that most of the cycles were going to iteration
over the vector<bool> (bitset) of prepared columns. In this case, only
one column was prepared (the one with the predicate) so the iteration
and bitset-testing was a big waste of CPU.

In fact, any given column will only be prepared once, so we can just
keep a simple list of the prepared columns and iterate over those
explicitly, making the loop O(num prepared columns) instead of
O(columns).

This dropped CFileSet::Iterator::FinishBatch() from taking 12.4% of the
scanner CPU down to about 0.2%.

Change-Id: I997fe832fdfa8d92fbbcb5d7c5bd4141e485b4f8
Reviewed-on: http://gerrit.cloudera.org:8080/13666
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 16 insertions(+), 22 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13666
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I997fe832fdfa8d92fbbcb5d7c5bd4141e485b4f8
Gerrit-Change-Number: 13666
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
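
For readers without the diff at hand, the idea in the commit message (replace a scan over a per-column vector<bool> with a walk over a short list of prepared column indexes) can be sketched roughly as follows. This is only an illustration and not the code from cfile_set.cc: the names ColumnIter, BitmapTracker, ListTracker, and finish_calls are hypothetical, and the returned loop counts exist only to make the difference in work visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-column iterator; not Kudu's real class.
struct ColumnIter {
  int finish_calls = 0;
  void FinishBatch() { ++finish_calls; }
};

// "Before" shape: one flag per column, tested for every column in the
// projection each time a batch is finished.
class BitmapTracker {
 public:
  explicit BitmapTracker(size_t num_cols)
      : cols_(num_cols), prepared_(num_cols, false) {}

  void Prepare(size_t idx) {
    assert(idx < cols_.size());
    prepared_[idx] = true;
  }

  // Returns the number of loop iterations, which is always the column count.
  size_t FinishBatch() {
    size_t visited = 0;
    for (size_t i = 0; i < cols_.size(); ++i) {
      ++visited;
      if (prepared_[i]) {
        cols_[i].FinishBatch();
        prepared_[i] = false;
      }
    }
    return visited;
  }

  int finish_calls(size_t idx) const { return cols_[idx].finish_calls; }

 private:
  std::vector<ColumnIter> cols_;
  std::vector<bool> prepared_;
};

// "After" shape: remember only the indexes that were prepared. This relies on
// the property stated in the commit message, that a given column is prepared
// only once per batch, so the list never holds duplicates.
class ListTracker {
 public:
  explicit ListTracker(size_t num_cols) : cols_(num_cols) {}

  void Prepare(size_t idx) {
    assert(idx < cols_.size());
    prepared_.push_back(idx);
  }

  // Returns the number of loop iterations, which is the prepared-column count.
  size_t FinishBatch() {
    size_t visited = 0;
    for (size_t idx : prepared_) {
      ++visited;
      cols_[idx].FinishBatch();
    }
    prepared_.clear();
    return visited;
  }

  int finish_calls(size_t idx) const { return cols_[idx].finish_calls; }

 private:
  std::vector<ColumnIter> cols_;
  std::vector<size_t> prepared_;
};
```

With a 100-column projection and a single prepared column (the predicate column in the query above), the bitmap version performs 100 flag tests per batch while the list version performs one iteration, which is the O(columns) versus O(num prepared columns) difference the commit message describes.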
