Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19931 )

Change subject: KUDU-3476: Make replica placement range and table aware
......................................................................


Patch Set 13:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc@375
PS12, Line 375: true,
> It seems we need to change this to 'true' if we want to benefit from from t
Done


http://gerrit.cloudera.org:8080/#/c/19931/12/src/kudu/master/catalog_manager.cc@377
PS12, Line 377: runtime);
> I think this flag should be dropped if setting --enable_range_replica_place
Done


http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc
File src/kudu/master/ts_descriptor.cc:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/master/ts_descriptor.cc@214
PS10, Line 214:     return;
              :   }
              :
              :   const double kHalflifeSecs = 
FLAGS_tserver_last_replica_creations_halflife_ms / 1000;
              :   const MonoTime now = MonoTime::Now();
              :   // If map for the table or range hasn't been initialized yet, 
use init_time_ as last decay.
              :   MonoTime last_decay =
              :       last_replica_decay_by_range_.find(table_id) == 
last_replica_decay_by_range_.end() ||
              :       last_replica_decay_by_range_[table_id].first.fin
> However, all this decay-style stuff means strange side-effects are possible
I agree the decay code is quite hacky and not super precise. As explained in 
the comments in PickTabletServer() in placement_policy.cc, the recency code is 
there to

"ensure that we take into account the recent selection decisions even if those 
replicas are still in the process of being created (and thus not yet reported 
by the server). This is important because, while creating a table, we batch the 
selection process before sending any creation commands to the servers 
themselves."

So the tablet servers won't be able to report these pending replicas, but as 
you mentioned the system catalog has this information so we could potentially 
look at that. I'll add a todo for a follow up patch.

I hadn't thought about this fast decay use case. My understanding is that it 
would take quite a few heartbeat cycles for the tablet servers to report the 
number of live replicas. This is because the requests to the servers would 
happen after the selection process for the replicas, at least from what I 
understood from the comment above. So IIUC, I don't think there's any 
possibility for the algorithm to double count replicas. If there's something 
I'm missing here in terms of the order of operations here, please let me know.


http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

http://gerrit.cloudera.org:8080/#/c/19931/10/src/kudu/tserver/ts_tablet_manager.h@222
PS10, Line 222: in INITIALIZED state, bu
> I think there isn't any other state except for RUNNING, BOOTSTRAPPING, INIT
Sounds good to me, I'll add a todo.



--
To view, visit http://gerrit.cloudera.org:8080/19931
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9caeb8d5547e946bfeb152a99e1ec034c3fa0a0f
Gerrit-Change-Number: 19931
Gerrit-PatchSet: 13
Gerrit-Owner: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Ádám Bakai <[email protected]>
Gerrit-Comment-Date: Fri, 30 Jun 2023 07:06:28 +0000
Gerrit-HasComments: Yes

Reply via email to