Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3037

to look at the new patch set (#2).

Change subject: bootstrap: filter row operations before decoding writes
......................................................................

bootstrap: filter row operations before decoding writes

This optimizes the bootstrap process: when all of the operations in a log
entry have already been applied to the tablet, we can skip processing that
entry completely.

Prior to this patch, we would still decode the row ops, decode the schema,
acquire all of the row locks, and only then realize that all of the row ops
were already flushed.

It turns out that we don't actually need to decode the row ops in order to know
whether they need to be applied or not. The results in the commit messages are
sufficient. So, this patch changes the 'filtering' process to happen up front,
and adds a fast path in the case where all of the ops are skippable.
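
As a rough illustration of the "filter up front, then fast-path" idea, here is
a minimal sketch. It is not the code in tablet_bootstrap.cc: OpResult,
FilterOutcome, FilterOps, and ReplayEntry are hypothetical names, and the
per-op commit result is simplified to two booleans.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified per-op outcome as recorded in a commit message.
// The real commit results carry more detail; this is an assumption.
struct OpResult {
  bool failed;           // the op failed originally, so there is nothing to replay
  bool already_flushed;  // the op's effect is already durable in the tablet
};

// Result of filtering: which ops still need replay, and whether none do.
struct FilterOutcome {
  std::vector<bool> needs_apply;
  bool all_skippable;
};

// Decides which ops need replay using only the commit results,
// without decoding any row operation.
FilterOutcome FilterOps(const std::vector<OpResult>& results) {
  FilterOutcome out;
  out.all_skippable = true;
  out.needs_apply.reserve(results.size());
  for (const OpResult& r : results) {
    const bool apply = !r.failed && !r.already_flushed;
    out.needs_apply.push_back(apply);
    if (apply) out.all_skippable = false;
  }
  return out;
}

// Replays one log entry and returns the number of ops that were applied.
// decode_calls counts how often the expensive path ran; it stands in for
// decoding the row ops and schema and acquiring row locks.
std::size_t ReplayEntry(const std::vector<OpResult>& results,
                        int* decode_calls) {
  const FilterOutcome f = FilterOps(results);
  if (f.all_skippable) {
    return 0;  // fast path: no decoding, no row locks
  }
  ++*decode_calls;
  std::size_t applied = 0;
  for (bool needed : f.needs_apply) {
    if (needed) ++applied;
  }
  return applied;
}
```

In this sketch, an entry whose ops are all failed or already flushed returns
before the decode counter is touched, which is the case the fast path targets.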

To quantify the improvement, I used tpch_real_world to load a 1GB dataset,
and then ran a standalone tablet-server against the generated data directory.
I looked at the 'Time spent bootstrapping' log line before and after the change
(3 runs each).

Before:
  Time spent bootstrapping tablet: real 9.520s       user 6.052s     sys 1.016s
  Time spent bootstrapping tablet: real 9.718s       user 5.988s     sys 0.984s
  Time spent bootstrapping tablet: real 9.535s       user 6.048s     sys 1.116s

After:
  Time spent bootstrapping tablet: real 3.825s       user 1.448s     sys 1.000s
  Time spent bootstrapping tablet: real 4.340s       user 1.480s     sys 0.988s
  Time spent bootstrapping tablet: real 7.438s       user 1.456s     sys 1.080s

This patch reduces user CPU time by about 4x (from roughly 6.0s to 1.46s).
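
The ratio can be checked from the user CPU figures in the six runs listed
above. The helper names Mean and UserCpuSpeedup are made up for this sketch.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of samples.
double Mean(const std::vector<double>& v) {
  return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Ratio of mean user CPU seconds before the patch to after the patch,
// using the three runs of each reported in this message.
double UserCpuSpeedup() {
  const std::vector<double> before = {6.052, 5.988, 6.048};
  const std::vector<double> after = {1.448, 1.480, 1.456};
  return Mean(before) / Mean(after);
}
```

The means are about 6.03s before and 1.46s after, a ratio of about 4.1.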

Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
---
M src/kudu/tablet/row_op.h
M src/kudu/tablet/tablet_bootstrap.cc
2 files changed, 132 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/37/3037/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
