Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/3037
to look at the new patch set (#2).
Change subject: bootstrap: filter row operations before decoding writes
......................................................................
bootstrap: filter row operations before decoding writes
This is an optimization to the bootstrap process: when all of the operations
in a log entry have already been applied to the tablet, we can skip processing
that entry entirely.
Prior to this patch, we would still decode the row ops, decode the schema,
acquire all of the row locks, and only then realize that all of the row ops
were already flushed.
It turns out that we don't actually need to decode the row ops in order to know
whether they need to be applied. The per-op results recorded in the commit
messages are sufficient. So, this patch moves the 'filtering' step up front,
and adds a fast path for the case where all of the ops are skippable.
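The up-front filtering described above can be sketched roughly as follows.
This is only an illustration, not the code in tablet_bootstrap.cc: OpResult,
ReplayDecision, and FilterOps are hypothetical names, and the single
'already_flushed' flag stands in for whatever the commit message actually
records per op.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the per-op result found in a commit message.
struct OpResult {
  bool already_flushed;  // true if the op's effect is already in the tablet
};

// Outcome of filtering one log entry (hypothetical names).
enum class ReplayDecision { kSkipEntry, kDecodeAndApply };

// Decides from the commit results alone whether an entry needs replaying.
// No row op or schema is decoded and no row lock is taken here.
// 'needs_apply' receives one flag per op, in order.
ReplayDecision FilterOps(const std::vector<OpResult>& results,
                         std::vector<bool>* needs_apply) {
  assert(needs_apply != nullptr);
  needs_apply->clear();
  needs_apply->reserve(results.size());
  bool any_to_apply = false;
  for (const OpResult& r : results) {
    const bool apply = !r.already_flushed;
    needs_apply->push_back(apply);
    any_to_apply = any_to_apply || apply;
  }
  // Fast path: every op is skippable, so the caller can drop the entry
  // without decoding it.
  return any_to_apply ? ReplayDecision::kDecodeAndApply
                      : ReplayDecision::kSkipEntry;
}
```

In this sketch, a caller would decode row ops and acquire locks only when
FilterOps returns kDecodeAndApply, and would then consult 'needs_apply' to
skip the individual ops that are already flushed.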
To quantify the improvement, I used tpch_real_world to load a 1GB dataset,
and then ran a standalone tablet server against the generated data directory.
I compared the 'Time spent bootstrapping' log line before and after the change
(3 runs each).
Before:
Time spent bootstrapping tablet: real 9.520s user 6.052s sys 1.016s
Time spent bootstrapping tablet: real 9.718s user 5.988s sys 0.984s
Time spent bootstrapping tablet: real 9.535s user 6.048s sys 1.116s
After:
Time spent bootstrapping tablet: real 3.825s user 1.448s sys 1.000s
Time spent bootstrapping tablet: real 4.340s user 1.480s sys 0.988s
Time spent bootstrapping tablet: real 7.438s user 1.456s sys 1.080s
This patch reduces user CPU time by about 4x (from roughly 6.0s to roughly
1.5s). Wall-clock time also drops, though it varies more between runs.
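For reference, the ratio can be recomputed from the six user-CPU figures
listed above. The helper names Mean and Speedup below are made up for this
illustration, and the sketch simply compares the arithmetic means of the two
sets of runs.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty set of timings, in seconds.
double Mean(const std::vector<double>& xs) {
  assert(!xs.empty());
  return std::accumulate(xs.begin(), xs.end(), 0.0) /
         static_cast<double>(xs.size());
}

// Ratio of mean 'before' time to mean 'after' time.
double Speedup(const std::vector<double>& before,
               const std::vector<double>& after) {
  return Mean(before) / Mean(after);
}

// User-CPU seconds copied from the log lines above.
const std::vector<double> kUserBefore = {6.052, 5.988, 6.048};
const std::vector<double> kUserAfter = {1.448, 1.480, 1.456};
```

Speedup(kUserBefore, kUserAfter) comes out a little above 4.1, consistent
with the "about 4x" figure.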
Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
---
M src/kudu/tablet/row_op.h
M src/kudu/tablet/tablet_bootstrap.cc
2 files changed, 132 insertions(+), 111 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/37/3037/2
--
To view, visit http://gerrit.cloudera.org:8080/3037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>