Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16974 )
Change subject: KUDU-2612: add a TxnSystemClient to the tservers ...................................................................... Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/16974/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16974/4//COMMIT_MSG@23 PS4, Line 23: connecting to > nit: connecting to master? Done http://gerrit.cloudera.org:8080/#/c/16974/4//COMMIT_MSG@33 PS4, Line 33: I left a TODO to refactor for code reuse > SGTM -- I guess we can address that later on when the rest of the related p Ack http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc File src/kudu/rpc/sasl_common.cc: http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc@a280 PS4, Line 280: > Why this is disabled? Sorry,I think you briefly mentioned offline, but I di Servers initialize SASL in ServerBase::Init(), which calls the Once function with the keytab. Clients, when building their messengers, also call SASL, but don't use a keytab and instead rely on importing credentials from tokens. There's a mismatch when both of these happen in the same process, since SASL initialization sets some global state at L70. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc File src/kudu/transactions/txn_system_client.cc: http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@62 PS4, Line 62: disable_txn_system_client_init > I guess a flag tagged as 'unsafe' is automatically marked as 'hidden': http Right, 'unsafe' flags are automatically 'hidden'. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@383 PS4, Line 383: ile (!shutting_down_) { : static const MonoDelta kRetryInterval = MonoDelta::FromSeconds(1); : if (PREDICT_FALSE(FLAGS_disable_txn_system_client_init)) { : KLOG_EVERY_N_SECS(WARNING, 60) << : Substitute("initialization of TxnSystemClient disabled, will retry in $0", : kRetryInterval.ToString()); : > As we are retrying, do you expect FLAGS_disable_txn_system_client_init to c I don't, but it can. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@404 PS4, Line 404: > nit: does it make sense to set it to be equal to the connection negotiation Done http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@390 PS4, Line 390: continue; : } : // HACK: if the master addresses are all totally unreachable, : // KuduClientBuilder::Build() will hang, attempting fruitlessly to : // retry, in the below call to Create(). So first, make sure we can at : // least reach the masters; if not, try again. : // TODO(awong): there's still a small window between these pings and : // client creation. If this ends up being a problem, we may need to : // come to a more robust solution, e.g. adding a timeout to Create(). : DnsResolver dns_resolver; : Status s; : for (const auto& hp : master_addrs) { : vector<Sockaddr> addrs; : s = dns_resolver.ResolveAddresses(hp, &addrs).AndThen([&] { : unique_ptr<MasterServiceProxy> proxy( : new MasterServiceProxy(messenger, addrs[0], hp.host())); : PingRequestPB req; : PingResponsePB resp; : RpcController rpc; : rpc.set_timeout(MonoDelta::FromMilliseconds(FLAGS_rpc_negotiation_timeout_ms)); : > I guess the sanitize check is sufficient enough, even later when we actuall I'm not sure I understand the question. Are you asking what happens if, between the Ping() and Create() calls, the masters become unreachable? Yeah, it might be worth making this robust, though I'm going to hold off and write a TODO, since having all masters become unavailable at once seems indeed unlikely. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@414 PS4, Line 414: } > nit: what if masters become unavailable right after the check above but bef Yeah, it might be worth making this robust, though I'm going to hold off and write a TODO, since having all masters become unavailable at once seems indeed unlikely. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@435 PS4, Line 435: shutdown c > Do we want to check(dcheck) txn_client_ is not null here? Done http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@443 PS4, Line 443: nava > nit: why not Shutdown() after Wait()? It happens in the threadpool's destructor anyway, but I can add it here too I guess. http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/tserver/tablet_server.cc File src/kudu/tserver/tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/tserver/tablet_server.cc@107 PS4, Line 107: ); > nit: parenthesis are not necessary here Done -- To view, visit http://gerrit.cloudera.org:8080/16974 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I33b5a2bb5c56ae4bb4b42069af5813e2780fc4bc Gerrit-Change-Number: 16974 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Wed, 27 Jan 2021 23:39:44 +0000 Gerrit-HasComments: Yes
