perf: improve string data generation performance

This changes the 'perf loadgen' tool to stop using ostringstream to
format each generated string. Instead, it reuses a single string buffer
and uses StrAppend from gutil to produce the same output.
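
For readers without the Kudu tree at hand, the buffer-reuse pattern can be
sketched in standalone C++ roughly as follows. This is only an illustration,
not the code in the commit: the class name StringGenerator is hypothetical,
and std::to_string stands in for gutil's StrAppend, which is not available
outside the Kudu source tree.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the tool's Generator; only the string path is shown.
class StringGenerator {
 public:
  explicit StringGenerator(std::size_t string_len) : string_len_(string_len) {}

  // Returns "<seq>." truncated or padded with 'x' to exactly string_len_ bytes.
  std::string Next() {
    buf_.clear();                    // empties the buffer for reuse across calls
    buf_ += std::to_string(seq_++);  // assumption: stand-in for gutil's StrAppend
    buf_ += '.';
    buf_.resize(string_len_, 'x');   // truncate or extend with 'x's
    return buf_;                     // returned by value, as in the diff
  }

 private:
  std::uint64_t seq_ = 0;
  const std::size_t string_len_;
  std::string buf_;  // reused buffer; no stream object is built per call
};
```

With a length of 6, successive calls would yield "0.xxxx", then "1.xxxx".
The point of the pattern is that no stream object is constructed per value,
which is where the std::locale costs in the profile below came from.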

Tested with:
   build/latest/bin/kudu.orig perf loadgen -num-threads 16 \
        -num-rows-per-thread $[10*1000*1000] \
        -table-num-replicas 3 -table-num-buckets=100 \
        my-master

Before this fix:
  time total  : 130030 ms
    2204814.701333      task-clock (msec)         #   16.535 CPUs utilized
  Top profile lines:
    14.17%  kudu.orig        libstdc++.so.6.0.19  [.] std::locale::locale
     7.26%  kudu.orig        libstdc++.so.6.0.19  [.] std::locale::~locale
     5.91%  kudu.orig        libstdc++.so.6.0.19  [.] std::use_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >
     5.41%  kudu.orig        libstdc++.so.6.0.19  [.] std::has_facet<std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > > >
     4.69%  kudu.orig        libstdc++.so.6.0.19  [.] std::has_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > >

After this change, throughput improved by about 43% and CPU time
(task-clock) dropped by about 34%:
  time total  : 90676.3 ms
    1458961.567442      task-clock (msec)         #   15.954 CPUs utilized

The top of the profile now consists entirely of client code rather than
the loadgen tool itself.

Change-Id: I4a3ebe39588d3da6a0775b168c5350fb420e025e
Reviewed-on: http://gerrit.cloudera.org:8080/9759
Reviewed-by: Adar Dembo <a...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/ffcadd5a
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/ffcadd5a
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/ffcadd5a

Branch: refs/heads/master
Commit: ffcadd5a60107c5e150c0083f1008a745e976280
Parents: 9ecf7d7
Author: Todd Lipcon <t...@apache.org>
Authored: Thu Mar 22 10:20:17 2018 -0700
Committer: Todd Lipcon <t...@apache.org>
Committed: Thu Mar 22 18:34:48 2018 +0000

----------------------------------------------------------------------
 src/kudu/tools/tool_action_perf.cc | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/ffcadd5a/src/kudu/tools/tool_action_perf.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/tool_action_perf.cc b/src/kudu/tools/tool_action_perf.cc
index 7c23bc8..2aeb514 100644
--- a/src/kudu/tools/tool_action_perf.cc
+++ b/src/kudu/tools/tool_action_perf.cc
@@ -135,6 +135,7 @@
 #include "kudu/gutil/map-util.h"
 #include "kudu/gutil/stl_util.h"
 #include "kudu/gutil/strings/split.h"
+#include "kudu/gutil/strings/strcat.h"
 #include "kudu/gutil/strings/substitute.h"
 #include "kudu/tools/tool_action_common.h"
 #include "kudu/util/decimal_util.h"
@@ -292,6 +293,7 @@ class Generator {
   uint64_t seq_;
   Random random_;
   const size_t string_len_;
+  string buf_;
 };
 
 template <>
@@ -311,15 +313,11 @@ float Generator::Next() {
 
 template <>
 string Generator::Next() {
-  ostringstream ss;
-  ss << NextImpl() << ".";
-  string str(ss.str());
-  if (str.size() >= string_len_) {
-    str = str.substr(0, string_len_);
-  } else {
-    str += string(string_len_ - str.size(), 'x');
-  }
-  return str;
+  buf_.clear();
+  StrAppend(&buf_, NextImpl(), ".");
+  // Truncate or extend with 'x's.
+  buf_.resize(string_len_, 'x');
+  return buf_;
 }
 
 Status GenerateRowData(Generator* gen, KuduPartialRow* row,
