Yao Xu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13632 )

Change subject: KUDU-2823 Place tablet replicas based on deminsion
......................................................................


Patch Set 11:

(33 comments)

Thanks for your comments.  I fixed some of the issues mentioned in the 
comments. But there are still some comments, I want to confirm it again, so I 
didn't update it in this patch. :)

http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG@7
PS10, Line 7: deminsion
> dimension
Done


http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG@9
PS10, Line 9: the newly
            : created tablet to be evenly distributed on each tablet server
> maybe, rephrase for clarity:
Done


http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG@25
PS10, Line 25: deminsion
> dimension
Done


http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG@27
PS10, Line 27: deminsion
> dimension
Done


http://gerrit.cloudera.org:8080/#/c/13632/10//COMMIT_MSG@30
PS10, Line 30: deminsion
> dimension
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/client/client.h@882
PS10, Line 882: the tablet
> Which tablet?  Or you meant 'table'?
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/client/client.h@1210
PS10, Line 1210: to be created. D
> to be created ?
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/integration-tests/create-table-itest.cc
File src/kudu/integration-tests/create-table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/integration-tests/create-table-itest.cc@228
PS10, Line 228:   vector<string> master_fl
> maybe, drop this since it's empty anyways.
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/integration-tests/create-table-itest.cc@229
PS10, Line 229:       "--master_place_tablet_replicas_based_on_dimension=true",
              :       "--tserver_last_replica_creations_halflife_ms=10",
              :   };
> nit: the shorter form might be
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/integration-tests/create-table-itest.cc@233
PS10, Line 233: {}, mast
> nit: {}
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/integration-tests/create-table-itest.cc@321
PS10, Line 321: kN
> kNumServers ?
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc@708
PS10, Line 708:
> why boost::optional<string>* ?  It seems string* as a type for the
 > output parameter is good enough here, no?

I'm very sorry, this is my mistake. This code should be removed. We should use 
the dimension label of the tablet instead of the range.


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc@3733
PS10, Line 3733:
> dimension ?
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc@3748
PS10, Line 3748:     }
               :     DCHECK_GE(replication_factor, 1);
               :     const auto num_tservers_needed =
               :
> I think that in case of failure to get partition's debug string
 > it's safe enough to just log a warning and proceed with empty/none
 > dimension label.
 >
 > The idea is that this method is used for re-replication as well,
 > and it's better to place a replica maybe a bit off from the
 > dimension-dictated destination, but still preserve availability of
 > the tablet's data.

Thank you for your explanation. I think after drop GetRangePartitionDebugString 
function, we won't need to modify the code here. :)


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc@4688
PS10, Line 4688:
> is enabled
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/catalog_manager.cc@4688
PS10, Line 4688:
> dimension
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/master.proto@321
PS10, Line 321: in re
> drop
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc
File src/kudu/master/placement_policy-test.cc:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@904
PS10, Line 904:           { "ts1", 1000, { { "labelA", 1000 }, } },
              :           { "ts2", 1000, { { "labelA", 1000 },
> It would be nice to add a few scenarios where:
 >
 > * tablets for multiple dimensions label are being added (like 5
 > different dimensions) for the same table
 > * the initial distribution of tablet replicas has a few 'deeps' and
 > 'peaks' around some average level: e.g., 10, 20, 30
 > * the replicas are placed for tablets with replication factor of 3
 > (i.e. the first argument for PlaceTabletReplicas() is 3, not only
 > 1).
 > * verify how PlaceExtraTabletReplica() works as well
 > * there scenarios should cover the cases where the replication
 > factor is a) equal to the number of tablet servers b) replication
 > factor + 1 (e.g., for RF=3 that's 4 tablet servers) c) two times of
 > the replication factor (e.g. 6 tablet servers or more for RF=3).
 >
 > In the end, it's necessary to validate the overall distribution of
 > the replicas cluster-wise and label-wise.

Ok, I will add tests for these scenarios on the next update. I have a question, 
how to validate the overall distribution of the replicas cluster-wise? validate 
cluster-wise is skewed?


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@910
PS10, Line 910: fun
> nit: func
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@930
PS10, Line 930: boost::none
> It would be nice to have a coverage for the scenario when
 > label-specific replicas are added in the case of huge imbalance of
 > total number of replicas.

This is to verify that there is skew when placing a newly created tablet 
without dimension label. If this is not necessary, I think I can remove these 
code.


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@933
PS10, Line 933: ++placement_stats[ts_uuid];
> drop this: it does not provide much of the semantically meaningful coverage
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@941
PS10, Line 941:
> Did you run it many times (like thousands) to be sure that's a good
 > enough threshold to avoid flakiness in the future runs?

I ran this case 1000 times, the smallest stddev is 19.1. So I changed this to 
15.0, I think it will be safer.


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@944
PS10, Line 944: ;
> For clarity, it's better to use some other label which is not easily confla
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@944
PS10, Line 944: info)
> with dimension label ?
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@944
PS10, Line 944: SSERT_O
> Place
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@957
PS10, Line 957:
> I'm not sure this assert has some semantically meaningful coverage: sure, t
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy-test.cc@964
PS10, Line 964:
> Is it crucial to output from the test?  If that was just for
 > debugging purposes, consider removing this once the test is ready.

Ok, I will remove it.


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy.cc
File src/kudu/master/placement_policy.cc:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy.cc@56
PS10, Line 56:   // TODO (oclarms): get the number of times this tablet server 
has recently been
             :   //  selected to create a tablet replica by dimension.
> Is this about somehow amplifying the numbers already accounted for
 > by RecentReplicaCreations()?  I'm curious why that is needed.  Does
 > it mean the current way of accounting is not good enough?

I think so, when we add tablets of different dimensions at the same time, this 
may result in uneven distribution of tablets in some dimension.

However, I don't think this will happen often, so I plan to improve it in the 
next patch.


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy.cc@58
PS10, Line 58: return desc->RecentReplicaCreations() + 
desc->num_live_replicas(std::move(dimension));
> What if measuring the load of two tablet servers, where nothing has
 > been created so far (so both  RecentReplicaCreations() and
 > num_live_replicas(label) are 0), but first has 10, but second 20
 > tablet replicas total?  I think the former one should be considered
 > less loaded than the second.  Or you want those dimension to be
 > completely independent from the overall load?
 >
 > Maybe, add num_live_replicas() / 10000 or something like that into
 > the sum?  Ideally, that should be something that expresses the
 > preference of using dimension-only count against  weighted 'all
 > other counts'.  I'm not sure what's the best way to express it
 > here, though.  Probably, using 1/10000 is good enough initially
 > given it's not expected to host that many replicas per tablet
 > server :)

I think these dimensions are completely independent of the overall load to be a 
good choice. Maybe using the rebalance tool to solve the problem of uneven 
overall load will be better to me. :)


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/placement_policy.cc@73
PS10, Line 73: if
> nit:  If --> if
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/ts_descriptor.h
File src/kudu/master/ts_descriptor.h:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/master/ts_descriptor.h@136
PS10, Line 136: total
> total number
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/tserver/ts_tablet_manager.h@204
PS10, Line 204: Get th
> nit: it's not the result value of this method per se, so maybe replace 'Ret
Done


http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/13632/10/src/kudu/tserver/ts_tablet_manager.cc@1248
PS10, Line 1248: DCHEC
> nit: it's safe to use DCHECK here since in release build it would crash wit
Done



--
To view, visit http://gerrit.cloudera.org:8080/13632
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48a225e221eb42ef2f5489687e80a151d8dc1a42
Gerrit-Change-Number: 13632
Gerrit-PatchSet: 11
Gerrit-Owner: Yao Xu <ocla...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Yao Xu <ocla...@gmail.com>
Gerrit-Comment-Date: Tue, 09 Jul 2019 03:01:41 +0000
Gerrit-HasComments: Yes

Reply via email to