Hello Andrew Wong, Grant Henke,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/15678
to review the following change.
Change subject: Avoid calling Schema::find_column() once per RowBlock in
columnar serialization
......................................................................
Avoid calling Schema::find_column() once per RowBlock in columnar serialization
Prior to this patch, each row block being serialized in the columnar
format would result in a call to Schema::find_column(name) for each
projected column. Each lookup was relatively expensive, involving a hash
computation and a string equality check.
This patch changes the projection calculation to happen "up front", once per
Scan RPC, instead of on every per-rowblock call.
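
As a rough illustration of the idea (not the code in this patch), the sketch
below resolves projected column names to integer indexes once, then reuses
that mapping for every row block. ToySchema, ToyRowBlock, ComputeProjection
and SerializeBlock are hypothetical names invented for this example, and it
assumes every projected column exists in the block schema.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema; this is NOT Kudu's Schema class.
class ToySchema {
 public:
  explicit ToySchema(std::vector<std::string> names) : names_(std::move(names)) {
    for (size_t i = 0; i < names_.size(); ++i) {
      index_by_name_[names_[i]] = static_cast<int>(i);
    }
  }

  // Name-based lookup: hashes the name and compares strings.
  // Returns -1 if the column is absent.
  int find_column(const std::string& name) const {
    ++lookups_;  // counted only so the example can show how often this runs
    auto it = index_by_name_.find(name);
    return it == index_by_name_.end() ? -1 : it->second;
  }

  size_t num_columns() const { return names_.size(); }
  const std::string& column_name(size_t i) const { return names_[i]; }
  int lookups() const { return lookups_; }

 private:
  std::vector<std::string> names_;
  std::unordered_map<std::string, int> index_by_name_;
  mutable int lookups_ = 0;
};

// Hypothetical row block: one vector of cell values per column.
using ToyRowBlock = std::vector<std::vector<int>>;

// Done once per scan: map each projected column to its index in the
// block schema, so later per-block work never touches column names.
std::vector<int> ComputeProjection(const ToySchema& block_schema,
                                   const ToySchema& projection) {
  std::vector<int> mapping;
  mapping.reserve(projection.num_columns());
  for (size_t i = 0; i < projection.num_columns(); ++i) {
    int idx = block_schema.find_column(projection.column_name(i));
    assert(idx >= 0);  // assumption: projected columns always exist
    mapping.push_back(idx);
  }
  return mapping;
}

// Done once per row block: append the projected columns to the output
// using only the precomputed integer indexes.
void SerializeBlock(const ToyRowBlock& block, const std::vector<int>& mapping,
                    std::vector<std::vector<int>>* out) {
  if (out->size() < mapping.size()) out->resize(mapping.size());
  for (size_t i = 0; i < mapping.size(); ++i) {
    const std::vector<int>& src = block[mapping[i]];
    (*out)[i].insert((*out)[i].end(), src.begin(), src.end());
  }
}
```

In this sketch the number of find_column() calls depends only on the number
of projected columns, not on how many row blocks the scan serializes.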
This optimization could also apply to the rowwise serialization, but I
found that the other overheads inherent in that code path are so high
that the find_column calls aren't particularly noticeable. Nonetheless
I left a TODO.
Change-Id: I1b683c7d6d6fe1026ee06c8b5ebfe2a5f1ee6cb1
---
M src/kudu/common/columnar_serialization.cc
M src/kudu/common/columnar_serialization.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/tserver/tablet_service.cc
4 files changed, 108 insertions(+), 79 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/78/15678/1
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1b683c7d6d6fe1026ee06c8b5ebfe2a5f1ee6cb1
Gerrit-Change-Number: 15678
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>