Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10633 )

Change subject: KUDU-1861: add range-partitions to loadgen tables
......................................................................


Patch Set 5:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1473
PS5, Line 1473:   ExternalMiniClusterOptions opts;
              :   opts.num_tablet_servers = 1;
              :   NO_FATALS(StartExternalMiniCluster(std::move(opts)));
nit: since std::move() is used for 'opts' which invalidates 'opts' for the rest 
of the scope, consider placing this code into a separate scope.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1476
PS5, Line 1476:   vector<string> base_args = {
nit: const ?


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1479
PS5, Line 1479: "--num_rows_per_thread=2048",
Why default 1000 is not good enough here?


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1480
PS5, Line 1480:     "--flush_per_n_rows=1",
Could you add a comment explaining why manual flushing per row is essential in 
this case?


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1481
PS5, Line 1481: "--table_num_replicas=1"
It's 1 by default, so this option can be omitted.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1482
PS5, Line 1482:     "--num_threads=3",
Why default is not good enough?  If it's not indeed, I think it's worth adding 
a comment about that.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1487
PS5, Line 1487:     // Let's ensure nothing funny happens inserting across the 
entire keyspace.
              :     "--use_random",
By my experience, funny things happen when using random :)  Maybe, you meant to 
not using random here?


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1512
PS5, Line 1512: kTimeout
Is it stable enough for TSAN/ASAN builds with 10ms timeout?


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1512
PS5, Line 1512:   ASSERT_OK(WaitForNumTabletsOnTS(ts, expected_tablets, 
kTimeout, &tablets));
              :   ASSERT_EQ(expected_tablets, tablets.size());
Here and below: it seems these two lines of code can be reduced to

ASSERT_OK(WaitForNumTabletsOnTS(ts, expected_tablets, kTimeout));

So, the 'tablets' variable can be dropped.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1534
PS5, Line 1534: workload result
nit: 'workloads result' or 'workload results'


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1556
PS5, Line 1556:     "--table_num_replicas=1",
              :     "--keep_auto_table=false",
              :     "--run_scan=false",
              :     "--use_random=false",
I think that can be dropped because that are the defaults values for those 
parameters.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/kudu-tool-test.cc@1568
PS5, Line 1568:     "--num_rows_per_thread=10000",
I think you could use '--string_len=32768' in addition if you want to generate 
'heavier' rows to reach the flush threshold sooner.


http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/tool_action_perf.cc
File src/kudu/tools/tool_action_perf.cc:

http://gerrit.cloudera.org:8080/#/c/10633/5/src/kudu/tools/tool_action_perf.cc@303
PS5, Line 303: DEFINE_int32(table_num_hash_partitions, 8,
             :              "The number of hash partitions to create when this 
tool creates "
             :              "a new table. Note: The total number of partitions 
must be "
             :              "greater than 1.");
             : DEFINE_int32(table_num_range_partitions, 1,
             :              "The number of range partitions to create when this 
tool creates "
             :              "a new table. A range partitioning schema will be 
determined to "
             :              "evenly split a sequential workload across ranges, 
leaving "
             :              "the outermost ranges unbounded to ensure coverage 
of the entire "
             :              "keyspace. Note: The total number of partitions 
must be greater "
             :              "than 1.");
Maybe, it's worth adding a group validator to verify that the total number of 
partitions is greater than 1?



--
To view, visit http://gerrit.cloudera.org:8080/10633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4f552a4c73dc82f3b7934082769522557ee5013
Gerrit-Change-Number: 10633
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 22 Jun 2018 21:56:12 +0000
Gerrit-HasComments: Yes

Reply via email to