Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/19931 )
Change subject: KUDU-3476: Make replica placement range and table aware ...................................................................... Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc@375 PS12, Line 375: true, > It seems we need to change this to 'true' if we want to benefit from from t Done http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc@377 PS12, Line 377: runtime); > I think this flag should be dropped if setting --enable_range_replica_place Done http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc File src/kudu/master/ts_descriptor.cc: http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc@214 PS10, Line 214: return; : } : : const double kHalflifeSecs = FLAGS_tserver_last_replica_creations_halflife_ms / 1000; : const MonoTime now = MonoTime::Now(); : // If map for the table or range hasn't been initialized yet, use init_time_ as last decay. : MonoTime last_decay = : last_replica_decay_by_range_.find(table_id) == last_replica_decay_by_range_.end() || : last_replica_decay_by_range_[table_id].first.fin > However, all this decay-style stuff means strange side-effects are possible I agree the decay code is quite hacky and not super precise. As explained in the comments in PickTabletServer() in placement_policy.cc, the recency code is there to "ensure that we take into account the recent selection decisions even if those replicas are still in the process of being created (and thus not yet reported by the server). This is important because, while creating a table, we batch the selection process before sending any creation commands to the servers themselves." So the tablet servers won't be able to report these pending replicas, but as you mentioned the system catalog has this information so we could potentially look at that. I'll add a todo for a follow up patch. I hadn't thought about this fast decay use case. My understanding is that it would take quite a few heartbeat cycles for the tablet servers to report the number of live replicas. This is because the requests to the servers would happen after the selection process for the replicas, at least from what I understood from the comment above. So IIUC, I don't think there's any possibility for the algorithm to double count replicas. If there's something I'm missing here in terms of the order of operations here, please let me know. http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h File src/kudu/tserver/ts_tablet_manager.h: http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h@222 PS10, Line 222: in INITIALIZED state, bu > I think there isn't any other state except for RUNNING, BOOTSTRAPPING, INIT Sounds good to me, I'll add a todo. -- To view, visit http://gerrit.cloudera.org:8080/19931 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9caeb8d5547e946bfeb152a99e1ec034c3fa0a0f Gerrit-Change-Number: 19931 Gerrit-PatchSet: 13 Gerrit-Owner: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Ashwani Raina <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Zoltan Chovan <[email protected]> Gerrit-Reviewer: Ádám Bakai <[email protected]> Gerrit-Comment-Date: Fri, 30 Jun 2023 07:06:28 +0000 Gerrit-HasComments: Yes
