Repository: kudu
Updated Branches:
  refs/heads/master 656ed6a84 -> 60276c54a


fs: update default data dir group size

This patch makes fs_target_data_dirs_per_tablet non-experimental and
sets the new default to 3. Upon experimenting with the flag via YCSB
workloads, only read workload tail latency seemed to be affected at very
small group sizes (e.g. 1). 3 seems like a reasonable choice, and we can
always update it in the future.

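Additionally, this change was tested on an 8-node cluster with an
analytical workload (TPC-H, scale factor: 300), each node with 7
data dirs and a separate WAL dir. The results below compare striping
across all data dirs ("All") with a disk group size of 3 ("Three"),
averaged across 10 runs. "Slowdown" is the "Three" time minus the "All"
time, so negative values mean the group size of 3 was faster.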

Query | All       Three     Slowdown
q1    | 13.931    14.343     0.412
q2    | 10.628    11.174     0.546
q3    | 30.917    30.932     0.015
q4    | 47.363    48.172     0.809
q5    | 37.468    38.072     0.604
q6    | 1.914     1.906     -0.008
q7    | 54.252    54.489     0.237
q8    | 23.196    22.494    -0.702
q9    | 52.618    53.88      1.262
q10   | 19.074    18.478    -0.596
q12   | 19.171    18.408    -0.763
q13   | 34.751    33.6      -1.151
q14   | 4.077     3.991     -0.086
q15   | 2.738     2.933      0.195
q16   | 14.617    15.567     0.95
q17   | 64.92     79.578    14.658
q18   | 95.475    100.285    4.81
q19   | 22.659    25.182     2.523
q20   | 25.13     26.979     1.849
q21   | 155.384   153.592   -1.792

Note that existing data will not be updated with the new flag. Only new
tablets will honor the sizing.
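
The sizing rule in the flag's help text can be illustrated with a small
sketch. It is only an illustration under stated assumptions: the helper
name EffectiveGroupSize and its parameters are hypothetical and are not
part of Kudu, and the sketch does not reproduce Kudu's actual
directory-selection logic. It covers only the three cases the help text
describes: 0 means all healthy data dirs, a target larger than the number
available is capped, and otherwise the target is used as is.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of Kudu: returns how many data dirs a new
// tablet's data would be spread across for a given flag value.
int EffectiveGroupSize(int target_dirs_per_tablet, int num_healthy_dirs) {
  // Mirrors the flag validator in the patch, which rejects negative values.
  assert(target_dirs_per_tablet >= 0);
  assert(num_healthy_dirs >= 0);
  // A value of 0 means "stripe across all healthy data dirs".
  if (target_dirs_per_tablet == 0) {
    return num_healthy_dirs;
  }
  // A target larger than what is available is capped at what is available.
  return std::min(target_dirs_per_tablet, num_healthy_dirs);
}
```

With the 7 data dirs per node used in the TPC-H test, this sketch gives a
group of 3 dirs for the new default and 7 dirs for the old default of 0.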

Change-Id: I2dd0d3f8cf140f684318c0dffb45a1091302ecdc
Reviewed-on: http://gerrit.cloudera.org:8080/8995
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <[email protected]>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/60276c54
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/60276c54
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/60276c54

Branch: refs/heads/master
Commit: 60276c54a221d554287c6645df7df542fe6d6443
Parents: 656ed6a
Author: Andrew Wong <[email protected]>
Authored: Wed Jan 10 13:04:32 2018 -0800
Committer: Todd Lipcon <[email protected]>
Committed: Wed Jan 24 20:48:58 2018 +0000

----------------------------------------------------------------------
 src/kudu/fs/data_dirs.cc               | 15 ++++++++-------
 src/kudu/tserver/tablet_server-test.cc |  8 +++++---
 2 files changed, 13 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/60276c54/src/kudu/fs/data_dirs.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/data_dirs.cc b/src/kudu/fs/data_dirs.cc
index f883797..163cec7 100644
--- a/src/kudu/fs/data_dirs.cc
+++ b/src/kudu/fs/data_dirs.cc
@@ -62,16 +62,17 @@
 #include "kudu/util/test_util_prod.h"
 #include "kudu/util/threadpool.h"
 
-DEFINE_int32(fs_target_data_dirs_per_tablet, 0,
-              "Indicates the target number of data dirs to spread each "
-              "tablet's data across. If greater than the number of data dirs "
-              "available, data will be striped across those available. The "
-              "default value 0 indicates striping should occur across all "
-              "healthy data directories.");
+DEFINE_int32(fs_target_data_dirs_per_tablet, 3,
+             "Indicates the target number of data dirs to spread each "
+             "tablet's data across. If greater than the number of data dirs "
+             "available, data will be striped across those available. A "
+             "value of 0 indicates striping should occur across all healthy "
+             "data dirs. Using fewer data dirs per tablet means a single "
+             "drive failure will be less likely to affect a given tablet.");
 DEFINE_validator(fs_target_data_dirs_per_tablet,
     [](const char* /*n*/, int32_t v) { return v >= 0; });
 TAG_FLAG(fs_target_data_dirs_per_tablet, advanced);
-TAG_FLAG(fs_target_data_dirs_per_tablet, experimental);
+TAG_FLAG(fs_target_data_dirs_per_tablet, evolving);
 
 DEFINE_int64(fs_data_dirs_reserved_bytes, -1,
              "Number of bytes to reserve on each data directory filesystem for "

http://git-wip-us.apache.org/repos/asf/kudu/blob/60276c54/src/kudu/tserver/tablet_server-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/tablet_server-test.cc b/src/kudu/tserver/tablet_server-test.cc
index 9fddf0d..73b0b1d 100644
--- a/src/kudu/tserver/tablet_server-test.cc
+++ b/src/kudu/tserver/tablet_server-test.cc
@@ -63,6 +63,7 @@
 #include "kudu/gutil/ref_counted.h"
 #include "kudu/gutil/stringprintf.h"
 #include "kudu/gutil/strings/escaping.h"
+#include "kudu/gutil/strings/join.h"
 #include "kudu/gutil/strings/substitute.h"
 #include "kudu/rpc/messenger.h"
 #include "kudu/rpc/rpc_controller.h"
@@ -501,10 +502,11 @@ TEST_F(TabletServerDiskFailureTest, TestRandomOpSequence) {
   const int kMaxKey = 100000;
 
   // Set these way up-front so we can change a single value to actually start
-  // injecting errors.
+  // injecting errors. Inject errors into all data dirs but one.
   FLAGS_crash_on_eio = false;
-  FLAGS_env_inject_eio_globs =
-    JoinPathSegments(mini_server_->options()->fs_opts.data_roots[1], "**");
+  const vector<string> failed_dirs = { mini_server_->options()->fs_opts.data_roots.begin() + 1,
+                                       mini_server_->options()->fs_opts.data_roots.end() };
+  FLAGS_env_inject_eio_globs = JoinStrings(JoinPathSegmentsV(failed_dirs, "**"), ",");
 
   set<int> keys;
   const auto GetRandomString = [] {
