Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14868 )
Change subject: client: optimize destruction of WriteRpc
......................................................................
client: optimize destruction of WriteRpc
When writing batches with many operations, the WriteRpc destructor
ends up cache-miss bound, since the InFlightOps and WriteOps it frees are
scattered all over memory. This change adds some prefetching, which sped
things up noticeably (~37% higher throughput) in a benchmark that is bound
by the reactor thread on the client side.
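To illustrate the general idea (this is not the code from the patch), here is a minimal sketch of prefetching ahead while destroying a list of heap-allocated objects. FakeInFlightOp, FakeWriteOp, DestroyWithPrefetch and the lookahead distance are hypothetical names and values chosen for this example. __builtin_prefetch is a GCC/Clang builtin, so the sketch assumes one of those compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the per-operation objects owned by a write RPC.
// The real InFlightOp and write operation classes in Kudu look different.
struct FakeWriteOp {
  static int live_count;  // number of FakeWriteOp objects currently alive
  std::vector<char> row_data;
  explicit FakeWriteOp(std::size_t bytes) : row_data(bytes) { ++live_count; }
  ~FakeWriteOp() { --live_count; }
};
int FakeWriteOp::live_count = 0;

struct FakeInFlightOp {
  std::unique_ptr<FakeWriteOp> write_op;
};

// Deletes every op in 'ops' and clears the vector. While deleting element i,
// it issues prefetches for elements further ahead so that their cache misses
// overlap with the current delete instead of stalling it.
// Returns the number of elements processed.
inline std::size_t DestroyWithPrefetch(std::vector<FakeInFlightOp*>* ops,
                                       std::size_t lookahead = 8) {
  assert(ops != nullptr);
  const std::size_t n = ops->size();
  const std::size_t half = lookahead / 2;
  for (std::size_t i = 0; i < n; ++i) {
    // Stage 1: start loading the op that is 'lookahead' slots ahead.
    if (i + lookahead < n) {
      __builtin_prefetch((*ops)[i + lookahead], 0, 3);
    }
    // Stage 2: the op 'half' slots ahead was prefetched on an earlier
    // iteration, so reading its pointer member is likely cheap by now.
    // Use it to start loading the nested write op as well.
    if (i + half < n && (*ops)[i + half] != nullptr) {
      __builtin_prefetch((*ops)[i + half]->write_op.get(), 0, 3);
    }
    delete (*ops)[i];  // also destroys the owned FakeWriteOp
  }
  ops->clear();
  return n;
}
```

A prefetch is only a hint and never faults, so the sketch stays correct regardless of the lookahead value. Whether it helps, and which distance works best, depends on the allocation pattern and the hardware, so it would need to be measured.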
$ perf stat ./build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8
Before:

Generator report
  time total  : 51403.6 ms
  time per row: 0.000642545 ms
Dropping auto-created table 'default.loadgen_auto_d289807fc12a4b1c861f79b19af9ec8e'

 Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

        180,585.24 msec task-clock                #    3.508 CPUs utilized
            25,373      context-switches          #    0.141 K/sec
             1,648      cpu-migrations            #    0.009 K/sec
            50,927      page-faults               #    0.282 K/sec
   726,022,544,856      cycles                    #    4.020 GHz                      (83.33%)
    71,782,315,500      stalled-cycles-frontend   #    9.89% frontend cycles idle     (83.36%)
   412,273,652,207      stalled-cycles-backend    #   56.79% backend cycles idle      (83.29%)
   408,271,477,858      instructions              #    0.56  insn per cycle
                                                  #    1.01  stalled cycles per insn  (83.35%)
    75,750,045,948      branches                  #  419.470 M/sec                    (83.33%)
       296,247,270      branch-misses             #    0.39% of all branches          (83.34%)

      51.475433628 seconds time elapsed

     178.590913000 seconds user
       1.935099000 seconds sys

After:

Generator report
  time total  : 37293.8 ms
  time per row: 0.000466172 ms
Dropping auto-created table 'default.loadgen_auto_ece2f41beef94a9fa032c77899f7e61c'

 Performance counter stats for './build/thinlto/bin/kudu perf loadgen localhost -num_rows_per_thread=10000000 -num_threads=8':

        189,125.49 msec task-clock                #    5.060 CPUs utilized
            29,363      context-switches          #    0.155 K/sec
             2,043      cpu-migrations            #    0.011 K/sec
            48,405      page-faults               #    0.256 K/sec
   772,496,448,279      cycles                    #    4.085 GHz                      (83.33%)
   129,999,474,226      stalled-cycles-frontend   #   16.83% frontend cycles idle     (83.36%)
   300,049,388,250      stalled-cycles-backend    #   38.84% backend cycles idle      (83.30%)
   414,415,517,571      instructions              #    0.54  insn per cycle
                                                  #    0.72  stalled cycles per insn  (83.32%)
    76,829,647,882      branches                  #  406.236 M/sec                    (83.34%)
       352,749,453      branch-misses             #    0.46% of all branches          (83.35%)

      37.376785122 seconds time elapsed

     186.834651000 seconds user
       2.143945000 seconds sys

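For readers who want to check the arithmetic behind the two generator reports, here is a small sketch that derives the relative improvement from the "time total" values above. The helper names (RunTimes, TimeReductionPct, ThroughputGainPct, MsPerRow) are made up for this example. The row count of 80,000,000 is taken from the command line (8 threads x 10,000,000 rows per thread).

```cpp
#include <cassert>
#include <cmath>

// Total generator times, in milliseconds, for the two runs being compared.
struct RunTimes {
  double before_ms;
  double after_ms;
};

// Percentage by which the wall-clock time shrank.
inline double TimeReductionPct(const RunTimes& r) {
  return 100.0 * (r.before_ms - r.after_ms) / r.before_ms;
}

// Percentage by which throughput (rows per unit time) grew for the same
// amount of work. This is always larger than the time reduction.
inline double ThroughputGainPct(const RunTimes& r) {
  return 100.0 * (r.before_ms / r.after_ms - 1.0);
}

// Average time spent per row.
inline double MsPerRow(double total_ms, double rows) {
  return total_ms / rows;
}
```

With the reported totals of 51403.6 ms and 37293.8 ms, the time reduction comes out at roughly 27% and the throughput gain at roughly 38%, so the "~37%" quoted in the description appears to be the throughput figure.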
Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Reviewed-on: http://gerrit.cloudera.org:8080/14868
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Todd Lipcon <[email protected]>
---
M src/kudu/client/batcher.cc
M src/kudu/common/partial_row.h
2 files changed, 80 insertions(+), 3 deletions(-)
Approvals:
Adar Dembo: Looks good to me, approved
Todd Lipcon: Verified
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I538f995f7ec161e746885c6b31cd1dccd72139b0
Gerrit-Change-Number: 14868
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>