Todd Lipcon has submitted this change and it was merged.

Change subject: bootstrap: filter row operations before decoding writes
......................................................................
bootstrap: filter row operations before decoding writes

This is an optimization to the bootstrap process: when all of the
operations in a log entry have already been applied to the tablet, we can
skip processing that entry entirely.

Prior to this patch, we would still decode the row ops, decode the schema,
acquire all of the row locks, and only then realize that all of the row ops
were already flushed.

It turns out that we don't actually need to decode the row ops in order to
know whether they need to be applied. The results in the commit messages
are sufficient. So, this patch moves the 'filtering' process up front and
adds a fast path for the case where all of the ops are skippable.

To quantify the improvement, I used tpch_real_world to load a 1GB dataset,
and then ran a standalone tablet-server against the generated data
directory. I looked at the 'Time spent bootstrapping' log line before and
after the change (3 runs each).

Before:
  Time spent bootstrapping tablet: real 9.520s user 6.052s sys 1.016s
  Time spent bootstrapping tablet: real 9.718s user 5.988s sys 0.984s
  Time spent bootstrapping tablet: real 9.535s user 6.048s sys 1.116s

After:
  Time spent bootstrapping tablet: real 3.825s user 1.448s sys 1.000s
  Time spent bootstrapping tablet: real 4.340s user 1.480s sys 0.988s
  Time spent bootstrapping tablet: real 7.438s user 1.456s sys 1.080s

This patch reduces user CPU time by about 4x (from roughly 6.0s to
roughly 1.5s per run).
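The filter-then-fast-path idea described above can be sketched roughly as
follows. This is only an illustration, not the code from
tablet_bootstrap.cc: LogEntry, ReplayStats, DecodeOp, and ReplayEntry are
hypothetical names, and the per-op 'already_flushed' flags stand in for the
results recorded in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a replicated write and its commit record.
struct LogEntry {
  std::vector<std::string> encoded_ops;  // row ops, still in encoded form
  std::vector<bool> already_flushed;     // per-op result from the commit message
};

// Counters used only to make the effect of the fast path observable.
struct ReplayStats {
  int entries_skipped = 0;
  int ops_decoded = 0;
  int ops_applied = 0;
};

// Stand-in for the expensive work (decoding row ops and schema, taking locks).
std::string DecodeOp(const std::string& encoded, ReplayStats* stats) {
  ++stats->ops_decoded;
  return encoded;
}

void ReplayEntry(const LogEntry& entry, ReplayStats* stats) {
  assert(entry.encoded_ops.size() == entry.already_flushed.size());

  // Filter up front, using only the commit results; nothing is decoded yet.
  bool all_skippable = true;
  for (bool flushed : entry.already_flushed) {
    if (!flushed) {
      all_skippable = false;
      break;
    }
  }

  // Fast path: every op was already flushed, so skip the entry entirely.
  if (all_skippable) {
    ++stats->entries_skipped;
    return;
  }

  // Slow path: decode the entry, then apply only the ops that still need it.
  for (std::size_t i = 0; i < entry.encoded_ops.size(); ++i) {
    std::string op = DecodeOp(entry.encoded_ops[i], stats);
    if (!entry.already_flushed[i]) {
      ++stats->ops_applied;  // a real implementation would apply 'op' here
    }
  }
}
```

In this sketch, an entry whose ops are all flagged as flushed never reaches
DecodeOp, which is where the CPU saving would come from under the stated
assumptions.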
Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
Reviewed-on: http://gerrit.cloudera.org:8080/3037
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
---
M src/kudu/tablet/row_op.h
M src/kudu/tablet/tablet_bootstrap.cc
2 files changed, 132 insertions(+), 111 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/3037

Gerrit-MessageType: merged
Gerrit-Change-Id: Ida7c54b122d8abee407fb8863a911a4d3887a9cb
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
