[kudu-CR] [c++ client] re-acquire authn token if expired
Alexey Serbin has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: Line 753: if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) { > I was having a hard time following this, so I tried to take a crack at redu I asked Mike to look at this as well to have an additional independent reviewer :) To Mike the reference thing was also confusing. Also, both variants looked not so good to him, so I decided to go with the alternative that you proposed: if you trust that code better, let it be the trusted code. I also found it doing the right thing, but I also spent some time to make sure it behaves like I want it to behave. -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [c++ client] re-acquire authn token if expired
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6648 to look at the new patch set (#18). Change subject: [c++ client] re-acquire authn token if expired .. [c++ client] re-acquire authn token if expired Updated C++ client to re-acquire authn token if server responds with FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation. If that happens while negotiating a connection for a call, the connection is closed and the client reconnects to the cluster to get a new authn token. After successfully re-acquiring authn token the client retries the RPC call again. Added corresponding integration test to cover authn token re-acquisition for various scenarios. Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 --- M docs/known_issues.adoc M docs/security.adoc M src/kudu/client/batcher.cc M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/master_rpc.cc M src/kudu/client/master_rpc.h M src/kudu/client/meta_cache.cc M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/authn_token_expire-itest.cc M src/kudu/integration-tests/cluster_verifier.h M src/kudu/rpc/client_negotiation.cc M src/kudu/rpc/client_negotiation.h M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/retriable_rpc.h M src/kudu/rpc/rpc.h M src/kudu/rpc/server_negotiation.cc 24 files changed, 752 insertions(+), 140 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/18 -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 18 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] Refactor ConsensusStatePB to hold committed and pending configs
Mike Percy has posted comments on this change. Change subject: Refactor ConsensusStatePB to hold committed and pending configs .. Patch Set 10: Code-Review+1 (1 comment) Looks good to me, just one nit. http://gerrit.cloudera.org:8080/#/c/6809/10/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: PS10, Line 199: nit: sorry to nitpick so hard on this, can we just align this opening quote with the opening quote on the above line? The GSG says it best: https://google.github.io/styleguide/cppguide.html#Function_Calls If the arguments do not all fit on one line, they should be broken up onto multiple lines, with each subsequent line aligned with the first argument. Do not add spaces after the open paren or before the close paren: bool result = DoSomething(averyveryveryverylongargument1, argument2, argument3); -- To view, visit http://gerrit.cloudera.org:8080/6809 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4bc4bdd9752fc29a7ce2cefcdc95c4366f5353af Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce per-RPC credentials policy
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6875 to look at the new patch set (#7). Change subject: [rpc] introduce per-RPC credentials policy .. [rpc] introduce per-RPC credentials policy This patch introduces policy for RPC authentication credentials. The authentication credentials policy allows for control over the type of client-side credentials used for making a remote procedure call. The idea behind this change is simple: sometimes the server's behavior depends on the type of client's credentials used to authenticate the client to the server in the context of the remote procedure call. If the client expects some particular behavior from the server, it has to explicitly specify the type of credentials it wants to use for the call. One example of an RPC depending on the type of the specified credentials is MasterService::ConnectToMaster(). It's impossible to receive an authentication token from the master if calling that method over a connection established with an authn token. To get a new authn token in that case, it's necessary to open a new connection to the master using types of credentials other than authn token (e.g., Kerberos credentials or TLS certificate will work). In other words, derived/secondary authentication credentials (such as authn token) can only be acquired if using the primary ones. That's a crucial restriction to allow for enforcing expiration of derived/secondary credentials. With this patch a client has an ability to re-acquire secondary authentication credentials (authn token) regardless of the type of credentials used to established current connection to Kudu master. As a part of this patch, a new unit test is added to cover the new functionality. Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 --- M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/messenger.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.h M src/kudu/rpc/proxy.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/rpc/rpc_controller.cc M src/kudu/rpc/rpc_controller.h M src/kudu/rpc/rpc_stub-test.cc 13 files changed, 305 insertions(+), 75 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/7 -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [c++ client] re-acquire authn token if expired
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6648 to look at the new patch set (#17). Change subject: [c++ client] re-acquire authn token if expired .. [c++ client] re-acquire authn token if expired Updated C++ client to re-acquire authn token if server responds with FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation. If that happens while negotiating a connection for a call, the connection is closed and the client reconnects to the cluster to get a new authn token. After successfully re-acquiring authn token the client retries the RPC call again. Added corresponding integration test to cover authn token re-acquisition for various scenarios. Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 --- M docs/known_issues.adoc M docs/security.adoc M src/kudu/client/batcher.cc M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/master_rpc.cc M src/kudu/client/master_rpc.h M src/kudu/client/meta_cache.cc M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/authn_token_expire-itest.cc M src/kudu/rpc/client_negotiation.cc M src/kudu/rpc/client_negotiation.h M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/retriable_rpc.h M src/kudu/rpc/rpc.h M src/kudu/rpc/server_negotiation.cc 23 files changed, 754 insertions(+), 142 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/17 -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 17 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [c++ client] re-acquire authn token if expired
Alexey Serbin has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (12 comments) http://gerrit.cloudera.org:8080/#/c/6648/16/docs/known_issues.adoc File docs/known_issues.adoc: PS16, Line 172: one week is now supported only for Kudu C++ client > nit: I think for the context of limitations, it's better to be explicit abo Done http://gerrit.cloudera.org:8080/#/c/6648/16/docs/security.adoc File docs/security.adoc: PS16, Line 221: Java > nit: insert "The" Done PS16, Line 223: yet > nit: we decided at some point to not use these time-related words in the do Done PS16, Line 223: C++ > nit: the Done PS16, Line 224: client > clients Done http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: PS16, Line 415: existring > typo Done http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: PS16, Line 731: ANY_BUT_AUTHN_OTKEN > I suggested the new name, so I'll go to bat for it. Besides being a positi ok. ANY_CREDENTIALS and PRIMARY_CREDENTIALS, right? Line 740: DCHECK(creds_policy == CredentialsPolicy::ANY_CREDENTIALS || > Isn't this statically verified by the compiler? I don't think it's going to be verified by compiler if somebody adds a new credentials policy, like SECONDARY_CREDENTIALS (meaning only secondary credentials may be used for connection negotiation). This debug-only assert will catch the case when this piece of code is not updated after that. Line 753: if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) { > I was having a hard time following this, so I tried to take a crack at redu To me -- nope, it's not :) I also spent some time trying to make sure that your code does what I want. It also has a misleading comment on non-existent connections if the final 'else' part. The two boolean flags store the fact whether it's necessary to create new RPC calls with appropriate credentials. Line 761: scoped_refptr& rpc( > This statement confusing. I think it's copying the scoped_refptr to an unn What would be that 'unnamed stack-allocated variable'? It just makes a reference either to leader_master_rpc_primary_creds_ or to leader_master_rpc_any_creds_ I'm not sure I understand: what unbound variable are you talking about? http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h File src/kudu/client/client-internal.h: PS16, Line 165: rpc::CredentialsPolicy creds_policy = : rpc::CredentialsPolicy::ANY_CREDENTIALS); > maybe it's just me, but I find this wrapping a bit odd. Maybe wrap all the I wrapped all parameters at +4 shift -- that looks cleaner (at least no additional line just for the default value). PS16, Line 237: scoped_refptr leader_master_rpc_any_creds_; : std::vector leader_master_callbacks_any_creds_; : scoped_refptr leader_master_rpc_primary_creds_; : std::vector leader_master_callbacks_primary_creds_; > Yes, the idea was to not wait for that re-connection again in that rare cas Also, if we don't split the callbacks, how do we know the process of scheduling the retries converges at all? I.e., consider the case when a retries are scheduled again and again, and the any-creds call always arrives first all the time, then comes the primary-creds call, and that so long? -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size
Mike Percy has posted comments on this change. Change subject: KUDU-2001 Add UNDO size to tablet on-disk size .. Patch Set 5: Code-Review+1 lgtm modulo the commit message, will defer to Todd for final approval -- To view, visit http://gerrit.cloudera.org:8080/6850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size
Mike Percy has posted comments on this change. Change subject: KUDU-2001 Add UNDO size to tablet on-disk size .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/6850/5//COMMIT_MSG Commit Message: Line 12: DeltaTracker::EstimateOnDiskSizeForCompaction() that > this got renamed in the latest rev? Yeah, renamed to EstimateRedoDeltaOnDiskSize() Also yeah, for Git the typical wrap length is 72 chars per https://stackoverflow.com/questions/2290016/git-commit-messages-50-72-formatting http://gerrit.cloudera.org:8080/#/c/6850/5/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: Line 692: return base_data_->EstimateOnDiskSize() + delta_tracker_->EstimateOnDiskSize(); > since this now includes the UNDO deltas, whereas before it didn't, will thi The major delta compaction estimate is below on L742 which now calls into L683 here so this looks right to me. I searched the code base for other uses of EstimateOnDiskSize() and I didn't find anything that could affect compaction policy. -- To view, visit http://gerrit.cloudera.org:8080/6850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce per-RPC credentials policy
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6875 to look at the new patch set (#6). Change subject: [rpc] introduce per-RPC credentials policy .. [rpc] introduce per-RPC credentials policy This patch introduces policy for RPC authentication credentials. The authentication credentials policy allows for control over the type of client-side credentials used for making a remote procedure call. The idea behind this change is simple: sometimes the server's behavior depends on the type of client's credentials used to authenticate the client to the server in the context of the remote procedure call. If the client expects some particular behavior from the server, it has to explicitly specify the type of credentials it wants to use for the call. One example of an RPC depending on the type of the specified credentials is MasterService::ConnectToMaster(). It's impossible to receive an authentication token from the master if calling that method over a connection established with an authn token. To get a new authn token in that case, it's necessary to open a new connection to the master using types of credentials other than authn token (e.g., Kerberos credentials or TLS certificate will work). In other words, derived/secondary authentication credentials (such as authn token) can only be acquired if using the primary ones. That's a crucial restriction to allow for enforcing expiration of derived/secondary credentials. With this patch a client has an ability to re-acquire secondary authentication credentials (authn token) regardless of the type of credentials used to established current connection to Kudu master. As a part of this patch, a new unit test is added to cover the new functionality. Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 --- M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/messenger.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.h M src/kudu/rpc/proxy.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/rpc/rpc_controller.cc M src/kudu/rpc/rpc_controller.h M src/kudu/rpc/rpc_stub-test.cc 13 files changed, 305 insertions(+), 75 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/6 -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [rpc] introduce per-RPC credentials policy
Alexey Serbin has posted comments on this change. Change subject: [rpc] introduce per-RPC credentials policy .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc File src/kudu/rpc/negotiation.cc: PS5, Line 163: SatisfiesCredentialsPolicy > yea, I think my main complaint about this is that we're calling SatisfiesCr We discussed this on-line (kudu-security Slack channel) and decided to add accessor for 'credentials_policy_' member. Line 215: conn->set_remote_features(client_negotiation.take_server_features()); > I think we're being pessimistic here in a very certain case: We discussed that on Slack kudu-security channel and the decision is to keep it as is (keep it simple). The issue with AuthenticationType is that it's not known before actual negotiation has happened, but we need to know about the policy of any connection prior to that because we want to find appropriate connection in ReactorThread::FindOrStartConnection(). Also, we discussed 'upgrading' the policy de facto as soon as the connection has been negotiated, but that seems to bring some additional complexity to fix a very corner case when authentication token expires on a connection which has been negotiated using the ANY_CREDENTIALS policy, but the actually there were no authentication token available to use, so the connection ended up to be satisfying the PRIMARY_CREDENTIALS policy. http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/reactor.cc File src/kudu/rpc/reactor.cc: PS5, Line 368: for (auto it = range.first; it != range.second; ++it) { : const auto& c = it->second; : if (c->scheduled_for_shutdown()) { : // This connection is not to be used for any new calls. : continue; : } : if (!c->SatisfiesCredentialsPolicy(cred_policy)) { : // Do not use a connection with a non-compliant credentials policy. : // Instead, open a new one, while marking the former as scheduled for : // shutdown. This process converges: any connection that satisfies the : // PRIMARY_CREDENTIALS policy automatically satisfies the ANY_CREDENTIALS : // policy as well. The idea is to keep only one usable connection identified : // by the specified 'conn_id'. : c->set_scheduled_for_shutdown(); : continue; : } : : // If the test-only 'one-connection-per-RPC' mode is enabled, connections : // are re-established at every RPC call. : if (PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) { : c->set_scheduled_for_shutdown(); : continue; : } : } : // Second pass: shutdown idle connections to the target destination if they : // have been marked. Non-idle connections will be taken care of later by the : // idle connection scanner. : scoped_refptr found_conn; : for (auto it = range.first; it != range.second;) { : const auto& c = it->second; : if (c->scheduled_for_shutdown()) { : if (c->Idle()) { : DCHECK_EQ(Connection::CLIENT, c->direction()); : c->Shutdown(Status::NetworkError("connection is closed due to non-reuse policy")); : it = client_conns_.erase(it); : continue; : } : } else { : DCHECK(!found_conn); : found_conn = c; : // Appropriate connection is found; continue further to take care of the : // rest of connections marked for shutdown by the first pass above. : } : ++it; : } > I think these loops can be simplified by combining them: I thought breaking this into two parts would give the code better readability, so it was intentional. I see that's not the case. Fine, I'll update this piece. http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/rpc-test.cc File src/kudu/rpc/rpc-test.cc: PS5, Line 295: PRC > RPC Done -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [c++ client] re-acquire authn token if expired
Dan Burkert has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: PS16, Line 731: ANY_BUT_AUTHN_OTKEN > You are right -- that's the old name (with an additional typo). Now it's P I suggested the new name, so I'll go to bat for it. Besides being a positive label, which I think is always better, it leaves the door open for considering IPKI certs to be non-primary in the future. We don't have any reason to do this at the moment, but I can imagine we might want to add a tserver->master API which is only valid when connecting via primary credentials. In fact, requesting a signed cert should probably already be like this. -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [c++ client] re-acquire authn token if expired
Dan Burkert has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (4 comments) http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: PS16, Line 731: ANY_BUT_AUTHN_OTKEN PRIMARY Line 740: DCHECK(creds_policy == CredentialsPolicy::ANY_CREDENTIALS || Isn't this statically verified by the compiler? Line 753: if (leader_master_rpc_primary_creds_ || need_new_rpc_primary_creds) { I was having a hard time following this, so I tried to take a crack at reducing the number of branches, this is what I came up with (in particular the two boolean flags are removed, since I had trouble following those): if (leader_master_rpc_primary_creds_) { // Piggy back on an existing connection with primary credentials. leader_master_callbacks_primary_creds_.push_back(cb); } else if (leader_master_rpc_any_creds_ && cred_policy == CredentialsPolicy::ANY_CREDENTIALS) { // Piggy back on an existing connection with any credentials. leader_master_callbacks_any_creds_.push_back(cb); } else { // There are no existing connections, create a new one with the required policy. scoped_refptr rpc( new internal::ConnectToClusterRpc( std::bind(::Data::ConnectedToClusterCb, this, std::placeholders::_1, std::placeholders::_2, std::placeholders::_3, creds_policy), std::move(master_sockaddrs), deadline, client->default_rpc_timeout(), messenger_, creds_policy)); switch (creds_policy) { case PRIMARY_CREDENTIALS: leader_master_rpc_primary_creds_.reset(rpc.get()); leader_master_callbacks_primary_creds_.push_back(cb); break; case ANY_CREDENTIALS: leader_master_rpc_any_creds_.reset(rpc.get()); leader_master_callbacks_any_creds_.push_back(cb); break; } l.unlock(); rpc->SendRpc(); } hopefully that's simpler? Line 761: scoped_refptr& rpc( This statement confusing. I think it's copying the scoped_refptr to an unnamed stack allocated variable, and then taking a reference to that, but I'm not really sure. I think what you want is scoped_refptr& rpc = need_new_rpc_primary_creds ? leader_master_rpc_primary_creds_ : leader_master_rpc_any_creds_; I really think we should forbid taking references to unbound variables, the lifetime semantics are just too confusing, and there's never a good reason to do it. -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [c++ client] re-acquire authn token if expired
Alexey Serbin has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (2 comments) http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: PS16, Line 731: ANY_BUT_AUTHN_OTKEN > this isn't the current name (though I actually like something like you have You are right -- that's the old name (with an additional typo). Now it's PRIMARY_CREDENTIALS. I'll update this. The new naming looks more generic. http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h File src/kudu/client/client-internal.h: PS16, Line 237: scoped_refptr leader_master_rpc_any_creds_; : std::vector leader_master_callbacks_any_creds_; : scoped_refptr leader_master_rpc_primary_creds_; : std::vector leader_master_callbacks_primary_creds_; > is it really necessary to split the callbacks? eg if you try to request a C Yes, the idea was to not wait for that re-connection again in that rare case. -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-2001 Add UNDO size to tablet on-disk size
Todd Lipcon has posted comments on this change. Change subject: KUDU-2001 Add UNDO size to tablet on-disk size .. Patch Set 5: Code-Review-1 (2 comments) http://gerrit.cloudera.org:8080/#/c/6850/5//COMMIT_MSG Commit Message: Line 12: DeltaTracker::EstimateOnDiskSizeForCompaction() that this got renamed in the latest rev? nit: can you re-wrap this paragraph? seems like it's wrapped really small http://gerrit.cloudera.org:8080/#/c/6850/5/src/kudu/tablet/diskrowset.cc File src/kudu/tablet/diskrowset.cc: Line 692: return base_data_->EstimateOnDiskSize() + delta_tracker_->EstimateOnDiskSize(); since this now includes the UNDO deltas, whereas before it didn't, will this have a potential effect on compaction policy? We need to be quite careful with such a change to make sure it doesn't have unintended consequences. (eg what if a DRS including its undos is now larger than the whole compaction budget, but those UNDOs are non-GCable?) -- To view, visit http://gerrit.cloudera.org:8080/6850 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I59001adadb9a768a464e7b2cf0f0a5df0ef5393a Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] KUDU-1192 Periodically flush glog buffers from a thread
Todd Lipcon has posted comments on this change. Change subject: KUDU-1192 Periodically flush glog buffers from a thread .. Patch Set 5: (3 comments) Can you also add something like: google::SetCommandLineOptionWithMode("logbufsecs", "5", google::FlagSettingMode::SET_FLAGS_DEFAULT); in ParseCommandLineFlags() in flags.cc? I think changing our default to 5 secs instead of 30 is a good idea with this change, so that the messages are flushed a bit quicker. http://gerrit.cloudera.org:8080/#/c/6853/5/src/kudu/util/async_logger.cc File src/kudu/util/async_logger.cc: PS5, Line 120: regardless there "regardless whether there is" http://gerrit.cloudera.org:8080/#/c/6853/5/src/kudu/util/logging-test.cc File src/kudu/util/logging-test.cc: Line 175: // wait a little more than one wake up cycle nit: capitalize sentence and end in '.' nit: "wake-up" PS5, Line 180: flush nit: flushed -- To view, visit http://gerrit.cloudera.org:8080/6853 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id4c6d440e9259efcf222530f13137f7de5bf00fc Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: William LiGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: William Li Gerrit-HasComments: Yes
[kudu-CR] KUDU-1549: compact LBM container metadata files at startup
Adar Dembo has submitted this change and it was merged. Change subject: KUDU-1549: compact LBM container metadata files at startup .. KUDU-1549: compact LBM container metadata files at startup Here's another patch to speed up LBM startup by reducing the amount of metadata processed from disk. The idea is to find metadata files with lots of "dead" blocks (i.e. CREATE + DELETE pairs) and to "compact" them by rewriting the files without the dead blocks' records. The set of containers to compact is determined by the ratio of live blocks to total blocks, and there's a new gflag to configure that. To make this work, I had to adjust the accounting in BlockCreated() yet again. The new approach preserves next_block_offset_, but uses fs_aligned_length() directly for byte-related stats. Without this change, the aligned container stats (total_bytes and live_bytes_aligned) are completely out of whack in a container whose metadata was compacted. Like the dead container deletion patch, this one is quick and dirty in that the new logic is unsuitable for real-time compaction. Also like that patch, compaction in real-time would require non-trivial synchronization changes. Testing is done in several places: - BlockManagerStressTest: a "real" workload that'll compact some containers with additional inconsistencies injected by the LBMCorruptor. - BlockManagerTest.TestMetadataOkayDespiteFailedWrites: filesystem errors are injected into metadata compaction. - LogBlockManagerTest.TestCompactFullContainerMetadataAtStartup: explicit test for metadata compaction (and for the stat accounting fix). Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Reviewed-on: http://gerrit.cloudera.org:8080/6826 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon--- M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/fs_report.cc M src/kudu/fs/fs_report.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/util/env_posix.cc 8 files changed, 331 insertions(+), 44 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] log block manager: reopen container metadata writers at the OS level
Adar Dembo has submitted this change and it was merged. Change subject: log block manager: reopen container metadata writers at the OS level .. log block manager: reopen container metadata writers at the OS level Another patch I'm working on compacts LBM metadata files at startup by writing compacted metadata into a temporary file, then overwriting the existing metadata file with it. For this to work properly, "reopening" a metadata file should actually reopen the file on the filesystem. This patch does just that. There should be no impact on any existing code; the reopening done after truncating a partial record from a metadata file is just as happy if done at the logical level (before) as at the physical level (now). By popular demand, I've attached below a long explanation of Kudu's approach to object initialization and thread safety that came up during code review. Because we don't use exceptions within Kudu, the initialization of an object is typically done in two phases: 1. Constructor, for operations that cannot fail. 2. Initializer function, for operations that may fail. In most objects, the initializer function is named Init(), returns a Status type, and needn't be thread safe because it is called right after the constructor, before the object is "made public" to other threads. Some objects offer two initializer functions: one to initialize a brand new object and one to initialize an object with existing state. The idea is that callers always use the constructor and then call one of the two initializer functions, depending on their use case. WritablePBContainerFile is one such object; Init() is for a brand new container file, and Open() is for a container file that already exists on disk. Previously, Reopen() served double duty: it was both the initializer function for files with existing state, and it was an arbitrary function (like Append()) that could be called at any time and by any thread. That second responsibility is why it was thread safe. Now, OpenExisting() is just the equivalent of CreateNew(), and so it makes sense for it to be just as thread unsafe as CreateNew() is. To be clear, the LBM was the only "arbitrary" caller to Reopen(), and the LBM didn't _need_ it to be thread safe. But, I think it was net less confusing for Reopen() to be thread safe (and thus equivalent in semantics to Append()) than for it to be thread unsafe (and thus exceptional when compared to Append(), Sync(), etc.). Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394 Reviewed-on: http://gerrit.cloudera.org:8080/6825 Reviewed-by: Todd LipconTested-by: Kudu Jenkins Reviewed-by: David Ribeiro Alves --- M src/kudu/fs/log_block_manager-test-util.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/util/pb_util-test.cc M src/kudu/util/pb_util.cc M src/kudu/util/pb_util.h 6 files changed, 33 insertions(+), 40 deletions(-) Approvals: David Ribeiro Alves: Looks good to me, approved Todd Lipcon: Looks good to me, but someone else must approve Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6825 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6029d499422f5a2db036d796c7e208f2af71c394 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1549: compact LBM container metadata files at startup
Todd Lipcon has posted comments on this change. Change subject: KUDU-1549: compact LBM container metadata files at startup .. Patch Set 5: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/fs_report.h File src/kudu/fs/fs_report.h: Line 127: // Note: the BlockRecordPB is passed by pointer so that it can be swapped > I'm pining away for protobuf move constructors. yea, we should add -DPROTO_EXPERIMENTAL_ENABLE_MOVE to our protobuf build now that we are on protobuf 3.3. I'll do that in a separate patch while I'm thinking of it. http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: PS2, Line 295: //files compacted. : // : > Correct: the LBM maintains a global map of LogBlocks, each of which is a li ok, not too worried then, will cross fingers and we can address if it becomes problematic for some reason -- To view, visit http://gerrit.cloudera.org:8080/6826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Adar Dembo has posted comments on this change. Change subject: KUDU-1952 Remove round-robin for block placement .. Patch Set 30: Code-Review+1 Verified+1 Test failure is a known flake. Todd/David, I think this is good to go as-is (and fixes KUDU-1952). Could one of you take another look? -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 30 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1549: compact LBM container metadata files at startup
Todd Lipcon has posted comments on this change. Change subject: KUDU-1549: compact LBM container metadata files at startup .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/6826/2/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS2, Line 87: DEFINE_double(log_container_live_metadata_before_compact_ratio, 0.10, : "Desired ratio of live block metadata in log containers. If a " : "container's live to total block ratio dips below this value, " : "the container's metadata file will be compacted at startup."); > Thanks for the research. Yea, I think it would be good to capture the ideas above in a new JIRA. Definitely don't want to derail this work. -- To view, visit http://gerrit.cloudera.org:8080/6826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I981f7d9e7eb96fb40cef30ad96c5960b72d07756 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar DemboGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce per-RPC credentials policy
Todd Lipcon has posted comments on this change. Change subject: [rpc] introduce per-RPC credentials policy .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc File src/kudu/rpc/negotiation.cc: PS5, Line 163: SatisfiesCredentialsPolicy > FWIW I like the PRIMARY name. This logic may end up changing if we go the yea, I think my main complaint about this is that we're calling SatisfiesCredentialsPolicy on a connection that has not yet negotiated, which seems a bit odd. To me the name of the method implies that we're checking whether the negotiated authentication of a connection satistifes some policy, but here we are deciding _how_ to negotiate based on its configured policy. -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [c++ client] re-acquire authn token if expired
Todd Lipcon has posted comments on this change. Change subject: [c++ client] re-acquire authn token if expired .. Patch Set 16: (9 comments) http://gerrit.cloudera.org:8080/#/c/6648/16/docs/known_issues.adoc File docs/known_issues.adoc: PS16, Line 172: one week is now supported only for Kudu C++ client nit: I think for the context of limitations, it's better to be explicit about the negative cases. i.e "is not supported by the Java client" or "is only supported by the C++ client but not by the Java client" or somesuch. http://gerrit.cloudera.org:8080/#/c/6648/16/docs/security.adoc File docs/security.adoc: PS16, Line 221: Java nit: insert "The" PS16, Line 223: yet nit: we decided at some point to not use these time-related words in the docs, since they imply some future timeline, but we aren't actually talking about that timeline. PS16, Line 223: C++ nit: the PS16, Line 224: client clients http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: PS16, Line 415: existring typo http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.cc File src/kudu/client/client-internal.cc: PS16, Line 731: ANY_BUT_AUTHN_OTKEN this isn't the current name (though I actually like something like you have here better!) http://gerrit.cloudera.org:8080/#/c/6648/16/src/kudu/client/client-internal.h File src/kudu/client/client-internal.h: PS16, Line 165: rpc::CredentialsPolicy creds_policy = : rpc::CredentialsPolicy::ANY_CREDENTIALS); maybe it's just me, but I find this wrapping a bit odd. Maybe wrap all the params at the '(' instead so that the default param ends up further-indented from its param name? PS16, Line 237: scoped_refptr leader_master_rpc_any_creds_; : std::vector leader_master_callbacks_any_creds_; : scoped_refptr leader_master_rpc_primary_creds_; : std::vector leader_master_callbacks_primary_creds_; is it really necessary to split the callbacks? eg if you try to request a ConnectToCluster with primary-creds-only policy, but one is already running without that policy, the worst that happens is that the callback happens and, on your retry, you still don't have active tokens, and you have to reconnect again, right? i.e this could only be transient and only be transient once in a rare case -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [test-workload] setter for admin operation timeout
Alexey Serbin has submitted this change and it was merged. Change subject: [test-workload] setter for admin operation timeout .. [test-workload] setter for admin operation timeout Added a setter for client's default admin operation timeout. It sets corresponding property for the aggregated client builder. Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c Reviewed-on: http://gerrit.cloudera.org:8080/6887 Tested-by: Kudu Jenkins Reviewed-by: Dan Burkert--- M src/kudu/integration-tests/test_workload.h 1 file changed, 4 insertions(+), 0 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6887 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] [tests] operations timeout for ClusterVerifier
Alexey Serbin has submitted this change and it was merged. Change subject: [tests] operations timeout for ClusterVerifier .. [tests] operations timeout for ClusterVerifier Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873 Reviewed-on: http://gerrit.cloudera.org:8080/6888 Tested-by: Alexey SerbinReviewed-by: Dan Burkert --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/integration-tests/cluster_verifier.h 2 files changed, 21 insertions(+), 10 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Alexey Serbin: Verified -- To view, visit http://gerrit.cloudera.org:8080/6888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert
[kudu-CR] [rpc] introduce per-RPC credentials policy
Dan Burkert has posted comments on this change. Change subject: [rpc] introduce per-RPC credentials policy .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc File src/kudu/rpc/negotiation.cc: PS5, Line 163: SatisfiesCredentialsPolicy > this seems a little hard to follow. This makes it sound like "Are primary c FWIW I like the PRIMARY name. This logic may end up changing if we go the route of storing the authentication type in the connection instead of the policy, in which case I think the policy will get passed in directly here, which would make the check more like your equality check. -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Todd Lipcon has posted comments on this change. Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. Patch Set 34: (11 comments) http://gerrit.cloudera.org:8080/#/c/6514/34/src/kudu/rpc/server_negotiation.cc File src/kudu/rpc/server_negotiation.cc: PS34, Line 66: allows nit: "to allow" Line 95: vector* g_local_networks = new vector(); I think it's better to initialize this to nullptr, and then set it to the vector only after parsing/initializing the value. That way if we have a bug where we accidentally use it before it's been properly set, we'd get a crash instead of silent wrong results. Also, please wrap this in an anonymous namespace so that the symbol doesn't get exported to other translation units PS34, Line 209: NegotiatePB request; : RETURN_NOT_OK(RecvNegotiatePB(, _buf)); can you add a comment why we need to consume this? not quite following, since 'request' is unused PS34, Line 212: or unauthenticated this error can be more specific to unencrypted, right? PS34, Line 729: unencrypted or unauthenticated this one can be specific to unauthenticated Line 914: std::call_once(once, [] { the initialization of g_local_networks could be moved inside the 'if' statement below on line 925. this would help us provide a workaround in the case that, for some reason, GetLocalNetworks crashes, we can configure explicit networks and avoid calling it at all. PS34, Line 915: GetLocalNetworks does this need WARN_NOT_OK or something? PS34, Line 919: KUDU_C can just be CHECK_OK (the 'KUDU_...' variants are only used in the context of client headers) PS34, Line 926: = can use |= here instead of the || on line 928 http://gerrit.cloudera.org:8080/#/c/6514/34/src/kudu/util/net/net_util.h File src/kudu/util/net/net_util.h: PS34, Line 95: is nit: are PS34, Line 113: vector Network nit: "vector of" -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 34 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1826 add propagated timestamp to sync client
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6798 to look at the new patch set (#4). Change subject: KUDU-1826 add propagated timestamp to sync client .. KUDU-1826 add propagated timestamp to sync client Updating and getting the propagated timestamp only exist as functions of the AsyncKuduClient, but these functions are still useful to the KuduClient. This patch exposes AsyncKuduClient's updateLastPropagatedTimestamp() and getLastPropagatedTimestamp() to the KuduClient. The static value NO_TIMESTAMP is also added to the KuduClient so timestamps returned by KuduClient.getLastPropagatedTimestamp() can be compared with KuduClient.NO_TIMESTAMP. Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java 3 files changed, 51 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/6798/4 -- To view, visit http://gerrit.cloudera.org:8080/6798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] WIP KUDU-1826 add propagated timestamp to sync client
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6798 to look at the new patch set (#3). Change subject: WIP KUDU-1826 add propagated timestamp to sync client .. WIP KUDU-1826 add propagated timestamp to sync client Updating and getting the propagated timestamp only exist as functions of the AsyncKuduClient, but these functions are still useful to the KuduClient. This patch exposes AsyncKuduClient's updateLastPropagatedTimestamp() and getLastPropagatedTimestamp() to the KuduClient. The static value NO_TIMESTAMP is also added to the KuduClient so timestamps returned by KuduClient.getLastPropagatedTimestamp() can be compared with KuduClient.NO_TIMESTAMP. Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921 --- M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java 3 files changed, 51 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/98/6798/3 -- To view, visit http://gerrit.cloudera.org:8080/6798 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibbfeb9526423f5acd0ef6482869d36cdd146e921 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config
Will Berkeley has posted comments on this change. Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but still in config .. Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck-test.cc File src/kudu/tools/ksck-test.cc: Line 70: const Schema& schema, > warning: parameter 'tablet_id' is unused [misc-unused-parameters] Expected Line 71: const ChecksumOptions& options, > warning: parameter 'schema' is unused [misc-unused-parameters] Expected Line 242: Status RunKsck() { > warning: non-const reference parameter 'replicas', make it const or use a p Done http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck.cc File src/kudu/tools/ksck.cc: Line 798: } // namespace kudu > error: cannot initialize a parameter of type 'FILE *' (aka '_IO_FILE *') wi Done Line 798: } // namespace kudu > error: use of undeclared identifier 'setw'; did you mean 'getw'? [clang-dia Done Line 798: } // namespace kudu > error: use of undeclared identifier 'setw'; did you mean 'getw'? [clang-dia Done http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tools/ksck.h File src/kudu/tools/ksck.h: Line 104: return term_; > warning: value argument 'opid_index' can be moved to avoid copy [misc-move- Done http://gerrit.cloudera.org:8080/#/c/6772/5/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: Line 115: using kudu::consensus::LeaderStepDownResponsePB; > warning: using decl 'RaftConfigPB' is unused [misc-unused-using-decls] Done -- To view, visit http://gerrit.cloudera.org:8080/6772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [tests] operations timeout for ClusterVerifier
Dan Burkert has posted comments on this change. Change subject: [tests] operations timeout for ClusterVerifier .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-HasComments: No
[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6772 to look at the new patch set (#6). Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but still in config .. KUDU-1860: ksck doesn't identify tablets that are evicted but still in config This patch enhances ksck to gather consensus info from every tablet. It compares this info with master and outputs the master's config and every conflicting config, if there are any conflicts. To do this efficiently it implements a new GetAllConsensusStates RPC that gathers info about every replica's consensus state. This will catch at least the two problems identified in KUDU-1860: 1. The leader has a pending config to remove a tablet, but it is not committed so the master does not see this config. This can hide an unhealthy tablet if, e.g., one pending config member is down and the pending-to-be-kicked-out member is up, so 1/2 replicas are alive in the leader's active config but the master thinks 2/3 are alive. 2. No replica is leader but the master believes there is a leader because its cache is old and hasn't been updated. Sample output showing #1: https://gist.github.com/wdberkeley/f964439bb0c407d3769c2dd9a0959ca0 Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808 --- M src/kudu/consensus/consensus.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tablet_service.h 9 files changed, 412 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/6 -- To view, visit http://gerrit.cloudera.org:8080/6772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Dan Burkert has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: OK well I'm in favor of 'write buffer memory usage'. I just think it's really likely that people will look at the page, add up the amounts, and get confused when it's only 1/2 of the server memory usage. -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] WIP Don't suicide on EIO
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6773 to look at the new patch set (#11). Change subject: WIP Don't suicide on EIO .. WIP Don't suicide on EIO Rather than suiciding when reaching an EIO, this patch adds a mechanism that triggers error-handling in the form of a callback. This handler is attached to the lowest non-env operations that may result in EIO. E.g. PosixRWFile::Write() is an env operation that may result in an EIO. LogBlockContainer::WriteData()'s call to it must be wrapped in the new error-handling macro KUDU_RETURN_OR_HANDLE_EIO. Thus, all direct readers/writers of files must now implement EIO-handling code and wrap disk IO in KUDU_RETURN_OR_HANDLE_EIO. Also included is a new tablet data state called TABLET_DATA_CORRUPTED in which tablet data will not be deleted until explicitly requested or during a tablet copy. This patch is marked WIP, as the correct behavior after a disk fails is not yet included. For now, upon failure, the block managers will keep track of the dead disks and will mark the appropriate tablet peers as failed. Additionally, the error handling has been moved between layers offline, so more cleanup is being done to align with the above description. Change-Id: I4c221a5ea97c7b7ec5f8d395b5527dc16e63f828 --- M src/kudu/consensus/consensus_peers.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h A src/kudu/fs/error_manager.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/master/sys_catalog.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/tablet.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_server-test.cc M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h M src/kudu/util/status.h 22 files changed, 354 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/6773/11 -- To view, visit http://gerrit.cloudera.org:8080/6773 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c221a5ea97c7b7ec5f8d395b5527dc16e63f828 Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] [test-workload] setter for admin operation timeout
Dan Burkert has posted comments on this change. Change subject: [test-workload] setter for admin operation timeout .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6887 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-HasComments: No
[kudu-CR] WIP Allow external miniclusters to use many data dirs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6845 to look at the new patch set (#7). Change subject: WIP Allow external miniclusters to use many data dirs .. WIP Allow external miniclusters to use many data dirs In order to test different disk configurations, it is becoming increasingly important to have end-to-end testing with nodes backed by multiple directories. This patch adds this functionality to the ExternalMiniCluster class, which now supports the 'num_dirs_per_tserver' option, and the ExternalDaemon class, which now supports a list of data dirs instead of a single data dir. The ExternalMiniCluster will create the ExternalDaemon with a list of data dirs that reflects 'num_dirs_per_tserver' by using the original data dir name as a prefix and appending an integer (e.g. '/dir_name' with 'num_dirs_per_tserver = 2' will result in '/dir_name-0' and '/dir_name-1'). A new test called disk-failure-itest is added that exercises this. WIP: EIO-handling patch must be completed for further testing to make sense. Change-Id: Id2f5def6980ad394c8558ad97ba830f1b0257332 --- M src/kudu/fs/file_block_manager.cc M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/disk-failure-itest.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/integration-tests/ts_itest-base.h M src/kudu/tablet/tablet_replica_mm_ops.cc M src/kudu/util/path_util.cc M src/kudu/util/path_util.h M src/kudu/util/status.h 10 files changed, 282 insertions(+), 28 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/6845/7 -- To view, visit http://gerrit.cloudera.org:8080/6845 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id2f5def6980ad394c8558ad97ba830f1b0257332 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] Add fault injection of EIOs
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6881 to look at the new patch set (#3). Change subject: Add fault injection of EIOs .. Add fault injection of EIOs This patch adds the functionality to inject EIOs in env_posix based on a flag-specified ECMAScript regex determining which paths should fail and a flag-specified double indicating how often these failures should occur. Change-Id: I5190d5e7f296cf27b65f1ecd2c16f3315cc46a39 --- M src/kudu/util/env_posix.cc M src/kudu/util/fault_injection.cc M src/kudu/util/fault_injection.h 3 files changed, 93 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/6881/3 -- To view, visit http://gerrit.cloudera.org:8080/6881 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5190d5e7f296cf27b65f1ecd2c16f3315cc46a39 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Adar Dembo has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: > Adar, do you have any opinion on 'write buffer memory usage' vs > plain 'memory usage'? Not beyond bikeshedding, no. :) -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Dan Burkert has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: Adar, do you have any opinion on 'write buffer memory usage' vs plain 'memory usage'? -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[kudu-CR] [rpc] introduce per-RPC credentials policy
Dan Burkert has posted comments on this change. Change subject: [rpc] introduce per-RPC credentials policy .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/negotiation.cc File src/kudu/rpc/negotiation.cc: Line 215: conn->set_remote_features(client_negotiation.take_server_features()); I think we're being pessimistic here in a very certain case: The connection is created with CredentialsPolicy::ANY, but there currently is no token. In reality, that connection will end up using Kerberos if it's connecting to secure server. In that case, we could 'upgrade' ANY to PRIMARY. This could be done here by directly setting it on the connection, or you could store the negotiated AuthenticationType into the Connection instead of storing the CredentialsPolicy. I think this may be a little cleaner in the long run. http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/reactor.cc File src/kudu/rpc/reactor.cc: PS5, Line 368: for (auto it = range.first; it != range.second; ++it) { : const auto& c = it->second; : if (c->scheduled_for_shutdown()) { : // This connection is not to be used for any new calls. : continue; : } : if (!c->SatisfiesCredentialsPolicy(cred_policy)) { : // Do not use a connection with a non-compliant credentials policy. : // Instead, open a new one, while marking the former as scheduled for : // shutdown. This process converges: any connection that satisfies the : // PRIMARY_CREDENTIALS policy automatically satisfies the ANY_CREDENTIALS : // policy as well. The idea is to keep only one usable connection identified : // by the specified 'conn_id'. : c->set_scheduled_for_shutdown(); : continue; : } : : // If the test-only 'one-connection-per-RPC' mode is enabled, connections : // are re-established at every RPC call. : if (PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) { : c->set_scheduled_for_shutdown(); : continue; : } : } : // Second pass: shutdown idle connections to the target destination if they : // have been marked. Non-idle connections will be taken care of later by the : // idle connection scanner. : scoped_refptr found_conn; : for (auto it = range.first; it != range.second;) { : const auto& c = it->second; : if (c->scheduled_for_shutdown()) { : if (c->Idle()) { : DCHECK_EQ(Connection::CLIENT, c->direction()); : c->Shutdown(Status::NetworkError("connection is closed due to non-reuse policy")); : it = client_conns_.erase(it); : continue; : } : } else { : DCHECK(!found_conn); : found_conn = c; : // Appropriate connection is found; continue further to take care of the : // rest of connections marked for shutdown by the first pass above. : } : ++it; : } I think these loops can be simplified by combining them: scoped_refptr found_conn; for (auto it = range.first; it != range.second;) { const auto& c = it->second; if (c->scheduled_for_shutdown() || !c->SatisfiesCredentialsPolicy(cred_policy) || PREDICT_FALSE(FLAGS_rpc_reopen_outbound_connections)) { // Shutdown idle connections. Non-idle connections will be taken care of // later by the idle connection scanner. if (c->Idle()) { DCHECK_EQ(Connection::CLIENT, c->direction()); c->Shutdown(Status::NetworkError("connection is closed due to non-reuse policy")); it = client_conns_.erase(it); continue; } c->set_scheduled_for_shutdown(); } else { DCHECK(!found_conn); found_conn = c; } ++it; } http://gerrit.cloudera.org:8080/#/c/6875/5/src/kudu/rpc/rpc-test.cc File src/kudu/rpc/rpc-test.cc: PS5, Line 295: PRC RPC -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Adar Dembo has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG Commit Message: Line 29: Also I added a memory usage column to the detailed tablets > That's definitely correct for the block cache, but I'll let Adar answer for AFAIK WAL buffers aren't tracked by memtrackers. The log index cache is, but its tracker isn't parented to the tablet's tracker. -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] [java client] removes duplicate checks at KuduScanToken
Todd Lipcon has submitted this change and it was merged. Change subject: [java client] removes duplicate checks at KuduScanToken .. [java client] removes duplicate checks at KuduScanToken KUDU-579 added fault tolerant scanner support at java client. However,it missed TODO for KUDU-1040 and results in duplicate code at KuduScanToken. This patch removes it. Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9 Reviewed-on: http://gerrit.cloudera.org:8080/6882 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon--- M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java 1 file changed, 0 insertions(+), 4 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [java client] removes duplicate checks at KuduScanToken
Todd Lipcon has posted comments on this change. Change subject: [java client] removes duplicate checks at KuduScanToken .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/6882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7e036d5fe333009a74294d27c54fe846cdc404e9 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1860: ksck doesn't identify tablets that are evicted but still in config
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6772 to look at the new patch set (#5). Change subject: KUDU-1860: ksck doesn't identify tablets that are evicted but still in config .. KUDU-1860: ksck doesn't identify tablets that are evicted but still in config This patch enhances ksck to gather consensus info from every tablet. It compares this info with master and outputs the master's config and every conflicting config, if there are any conflicts. To do this efficiently it implements a new GetAllConsensusStates RPC that gathers info about every replica's consensus state. This will catch at least the two problems identified in KUDU-1860: 1. The leader has a pending config to remove a tablet, but it is not committed so the master does not see this config. This can hide an unhealthy tablet if, e.g., one pending config member is down and the pending-to-be-kicked-out member is up, so 1/2 replicas are alive in the leader's active config but the master thinks 2/3 are alive. 2. No replica is leader but the master believes there is a leader because its cache is old and hasn't been updated. Sample output showing #1: https://gist.github.com/wdberkeley/f964439bb0c407d3769c2dd9a0959ca0 Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808 --- M src/kudu/consensus/consensus.proto M src/kudu/tools/ksck-test.cc M src/kudu/tools/ksck.cc M src/kudu/tools/ksck.h M src/kudu/tools/ksck_remote.cc M src/kudu/tools/ksck_remote.h M src/kudu/tools/tool_action_cluster.cc M src/kudu/tserver/tablet_service.cc M src/kudu/tserver/tablet_service.h 9 files changed, 408 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/72/6772/5 -- To view, visit http://gerrit.cloudera.org:8080/6772 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I16e4de09821b372c3773b4ade3fd9e37ab818808 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Will Berkeley
[kudu-CR] spark: add support for fault tolerant scanner
Dan Burkert has submitted this change and it was merged. Change subject: spark: add support for fault tolerant scanner .. spark: add support for fault tolerant scanner This adds support to use fault tolerant scanner for spark job. By default non fault tolerant scanner is used. To turn on fault tolerant scanner, use job config: 'kudu.faultTolerantScan'. Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Reviewed-on: http://gerrit.cloudera.org:8080/6782 Reviewed-by: Dan BurkertTested-by: Kudu Jenkins --- M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala 4 files changed, 37 insertions(+), 8 deletions(-) Approvals: Dan Burkert: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/6782 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3f3192025ca5d74197600480fd3d040d70b4bbc2 Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Dan Burkert has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG Commit Message: Line 13: 2. /tablets has a toggle-collapse detailed table of tablets. > There's 4 possible elements in the tablets section I see, thanks for the clarification. Line 29: Also I added a memory usage column to the detailed tablets > It's not too jargony if it's correct and avoids the number being misleading That's definitely correct for the block cache, but I'll let Adar answer for WAL usage since I'm not sure. -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Andrew Wong has posted comments on this change. Change subject: KUDU-1952 Remove round-robin for block placement .. Patch Set 29: (4 comments) http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS24, Line 412: other > nit: in other places in this file this kind of statement is written as 'if Done PS24, Line 529: data_dir_set.insert(group_indices.begin(), group_indices.end()); : for (auto& e : data_dir_by_uuid_idx_) { : > nit: since it's C++11, consider Neat! Done PS24, Line 532: URN_ > nit: auto& ? Done http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS24, Line 20: #include > nit: per code style this should be placed after std and before kudu headers Done -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 29 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#29). Change subject: KUDU-1952 Remove round-robin for block placement .. KUDU-1952 Remove round-robin for block placement This is the first of a multi-patch patchset to mitigate the effects of single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's metadata will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. Testing is done at the block manager level in block_manager-test and log_block_manager-test, as well as in the new data_dirs-test. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/cfile/bloomfile-test-base.h M src/kudu/cfile/cfile-test-base.h M src/kudu/cfile/cfile-test.cc M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction-test.cc M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/deltafile-test.cc M src/kudu/tablet/deltamemstore-test.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 38 files changed, 991 insertions(+), 190 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/29 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 29 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Alexey Serbin has posted comments on this change. Change subject: KUDU-1952 Remove round-robin for block placement .. Patch Set 26: (4 comments) Just skimmed through. I need to get better understanding of the background to get more thoughtful feedback -- I hope I'll do one more pass this week. http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.cc File src/kudu/fs/data_dirs.cc: PS24, Line 412: other nit: in other places in this file this kind of statement is written as 'if (other != nullptr)'. Consider unifying this among those places. PS24, Line 529: for (uint16_t uuid : group_indices) { : data_dir_set.insert(uuid); : } nit: since it's C++11, consider data_dir_set.insert(group_indices.begin(), group_indices.end()); PS24, Line 532: auto nit: auto& ? http://gerrit.cloudera.org:8080/#/c/6636/24/src/kudu/fs/data_dirs.h File src/kudu/fs/data_dirs.h: PS24, Line 20: #include nit: per code style this should be placed after std and before kudu headers; like it is in data_dirs.cc https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 26 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity
eric-maynard has posted comments on this change. Change subject: KUDU-1422 [java client] modifiable error collector capacity .. Patch Set 5: The build error seems unrelated (?): 09:29:53 Error: chose Boost outside 09:29:53 /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/common: 09:29:53 /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include 09:29:53 09:29:53 09:29:53 -- Configuring incomplete, errors occurred! -- To view, visit http://gerrit.cloudera.org:8080/5291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: eric-maynardGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: eric-maynard Gerrit-HasComments: No
[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity
eric-maynard has posted comments on this change. Change subject: KUDU-1422 [java client] modifiable error collector capacity .. Patch Set 5: Added trimming when error queue capacity is decreased -- To view, visit http://gerrit.cloudera.org:8080/5291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: eric-maynardGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: eric-maynard Gerrit-HasComments: No
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#28). Change subject: KUDU-1952 Remove round-robin for block placement .. KUDU-1952 Remove round-robin for block placement This is the first of a multi-patch patchset to mitigate the effects of single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's metadata will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. Testing is done at the block manager level in block_manager-test and log_block_manager-test, as well as in the new data_dirs-test. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/cfile/bloomfile-test-base.h M src/kudu/cfile/cfile-test-base.h M src/kudu/cfile/cfile-test.cc M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction-test.cc M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/deltafile-test.cc M src/kudu/tablet/deltamemstore-test.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 38 files changed, 992 insertions(+), 190 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/28 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 28 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [webui] Improvements for when there's many tablets & cleanup
Will Berkeley has posted comments on this change. Change subject: [webui] Improvements for when there's many tablets & cleanup .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/6870/3//COMMIT_MSG Commit Message: Line 13: 2. /tablets has a toggle-collapse detailed table of tablets. > I'm not sure this is an improvement, since there is no content under the ta There's 4 possible elements in the tablets section 1. summary of live tablets 2. table of live tablets 3. summary of tombstoned tablet 4. table of tombstoned tablets so having the tablets tables collapsed shows the 1-2 summaries. Previously, if there were tombstoned tablets, info about them would be hidden below the live tablets table. Line 29: Also I added a memory usage column to the detailed tablets > I'm concerned it may be misleading to call this 'memory usage', since it on It's not too jargony if it's correct and avoids the number being misleading. But I'm not familiar with how the mem trackers are working-- is it only tracking the write buffers because block caches, logs, etc are owned by different trackers? -- To view, visit http://gerrit.cloudera.org:8080/6870 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic3904a4b0fbb6446615cd46c8a6f30f81c832c53 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will BerkeleyGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: Yes
[kudu-CR] KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs
Hao Hao has posted comments on this change. Change subject: KUDU-1875: Refuse unauthenticated connections from publicly routable IP addrs .. Patch Set 34: (9 comments) http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/rpc/server_negotiation.cc File src/kudu/rpc/server_negotiation.cc: Line 60: DEFINE_string(trusted_subnets, > We discussed offline about this, just going to write up what we decided: Done PS32, Line 65: bnets of all local network interfaces w > I think this sentence will read better as "Set it to '0.0.0.0/0' to allow u Done Line 76: > It's typical to make the iteration variable a 'const auto&', which means th Done PS32, Line 212: rypted > nit: publicly Done Line 728: if (!IsTrustedConnection(addr)) { > Same here Done Line 916: }); > You can probably just wrap this in a KUDU_CHECK_OK call. It shouldn't ever Done http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/util/net/net_util.h File src/kudu/util/net/net_util.h: Line 96: // (same as network byte order). > add period to end of sentence. Done http://gerrit.cloudera.org:8080/#/c/6514/32/src/kudu/util/net/sockaddr.h File src/kudu/util/net/sockaddr.h: Line 73: // the default auto-generated copy constructor is fine here > What do you think about defining this operator on Network instead of Sockad Done Line 76: }; > I'm not sure it's worth defining this, since it's easily done with std::any Done -- To view, visit http://gerrit.cloudera.org:8080/6514 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6c3fbb5491785874c5701d6c9d866949cfac905e Gerrit-PatchSet: 34 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao HaoGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Harsh J Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] [test-workload] setter for admin operation timeout
Alexey Serbin has uploaded a new change for review. http://gerrit.cloudera.org:8080/6887 Change subject: [test-workload] setter for admin operation timeout .. [test-workload] setter for admin operation timeout Added a setter for client's default admin operation timeout. It sets corresponding property for the aggregated client builder. Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c --- M src/kudu/integration-tests/test_workload.h 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/6887/1 -- To view, visit http://gerrit.cloudera.org:8080/6887 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3b972c3edcc01b1a3c074958a5f2f0b3f64f608c Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin
[kudu-CR] [rpc] introduce per-RPC credentials policy
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6875 to look at the new patch set (#5). Change subject: [rpc] introduce per-RPC credentials policy .. [rpc] introduce per-RPC credentials policy This patch introduces policy for RPC authentication credentials. The authentication credentials policy allows for control over the type of client-side credentials used for making a remote procedure call. The idea behind this change is simple: sometimes the server's behavior depends on the type of client's credentials used to authenticate the client to the server in the context of the remote procedure call. If the client expects some particular behavior from the server, it has to explicitly specify the type of credentials it wants to use for the call. One example of an RPC depending on the type of the specified credentials is MasterService::ConnectToMaster(). It's impossible to receive an authentication token from the master if calling that method over a connection established with an authn token. To get a new authn token in that case, it's necessary to open a new connection to the master using types of credentials other than authn token (e.g., Kerberos credentials or TLS certificate will work). In other words, derived/secondary authentication credentials (such as authn token) can only be acquired if using the primary ones. That's a crucial restriction to allow for enforcing expiration of derived/secondary credentials. With this patch a client has an ability to re-acquire secondary authentication credentials (authn token) regardless of the type of credentials used to established current connection to Kudu master. As a part of this patch, a new unit test is added to cover the new functionality. Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 --- M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/messenger.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.h M src/kudu/rpc/proxy.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/rpc/rpc_controller.cc M src/kudu/rpc/rpc_controller.h M src/kudu/rpc/rpc_stub-test.cc 13 files changed, 291 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/6875/5 -- To view, visit http://gerrit.cloudera.org:8080/6875 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52f806e7b6f6362f66148530124e748e199ae6c2 Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [c++ client] re-acquire authn token if expired
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6648 to look at the new patch set (#16). Change subject: [c++ client] re-acquire authn token if expired .. [c++ client] re-acquire authn token if expired Updated C++ client to re-acquire authn token if server responds with FATAL_INVALID_AUTHENTICATION_TOKEN error on connection negotiation. If that happens while negotiating a connection for a call, the connection is closed and the client reconnects to the cluster to get a new authn token. After successfully re-acquiring authn token the client retries the RPC call again. Added corresponding integration test to cover authn token re-acquisition for various scenarios. Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 --- M docs/known_issues.adoc M docs/security.adoc M src/kudu/client/batcher.cc M src/kudu/client/client-internal.cc M src/kudu/client/client-internal.h M src/kudu/client/master_rpc.cc M src/kudu/client/master_rpc.h M src/kudu/client/meta_cache.cc M src/kudu/client/scanner-internal.cc M src/kudu/client/scanner-internal.h M src/kudu/integration-tests/CMakeLists.txt A src/kudu/integration-tests/authn_token_expire-itest.cc M src/kudu/rpc/client_negotiation.cc M src/kudu/rpc/client_negotiation.h M src/kudu/rpc/connection.cc M src/kudu/rpc/connection.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/outbound_call.cc M src/kudu/rpc/reactor.cc M src/kudu/rpc/reactor.h M src/kudu/rpc/retriable_rpc.h M src/kudu/rpc/rpc.h M src/kudu/rpc/server_negotiation.cc 23 files changed, 755 insertions(+), 143 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/6648/16 -- To view, visit http://gerrit.cloudera.org:8080/6648 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5 Gerrit-PatchSet: 16 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] [tests] operations timeout for ClusterVerifier
Alexey Serbin has uploaded a new change for review. http://gerrit.cloudera.org:8080/6888 Change subject: [tests] operations timeout for ClusterVerifier .. [tests] operations timeout for ClusterVerifier Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873 --- M src/kudu/integration-tests/cluster_verifier.cc M src/kudu/integration-tests/cluster_verifier.h 2 files changed, 21 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/6888/1 -- To view, visit http://gerrit.cloudera.org:8080/6888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I90828d0bd91809fe541782f4e5ded06cbae51873 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin
[kudu-CR] KUDU-1952 Remove round-robin for block placement
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6636 to look at the new patch set (#27). Change subject: KUDU-1952 Remove round-robin for block placement .. KUDU-1952 Remove round-robin for block placement This is the first of a multi-patch patchset to mitigate the effects of single-disk failure. Throughout the code, the term "DataDir" refers to a data directory, which is often mounted on a distinct disk. Thus, "disks" and "data directories" will be used interchangeably. This patch adds a mapping from tablet to a set of disks and uses it to replace the existing round-robin placement of blocks. Tablets are mapped to a fixed number of disks (i.e. a DataDirGroup). New blocks are placed randomly in directories within each tablet's DataDirGroup. Tablet-to-group mappings are generated and stored as metadata upon tablet creation, or upon tablet replacement during a tablet copy. During group creation, disks are added to groups by randomly selecting two available directories and selecting the one with fewer tablets on it. This avoids pigeonholing new tablets to disks with relatively few tablets, while still trending towards filling underloaded disks. Groups are maintained when restarting the server, as they are flushed with metadata, and are deleted upon tablet deletion. When loading tablet data from a previous version of Kudu, the tablet's metadata will not have a DataDirGroup. One will be generated containing all data directories, as the tablet's data may already be spread across any number of disks. As this patch only addresses block placement, it does not itself mitigate single-disk failure. Given this, and given the tradeoff between I/O and failure disk-failure tolerance, the default behavior will be to spread tablet data across all disks. Testing is done at the block manager level in block_manager-test and log_block_manager-test, as well as in the new data_dirs-test. A design doc can be found here: https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b --- M src/kudu/cfile/bloomfile-test-base.h M src/kudu/cfile/cfile-test-base.h M src/kudu/cfile/cfile-test.cc M src/kudu/fs/CMakeLists.txt M src/kudu/fs/block_manager-stress-test.cc M src/kudu/fs/block_manager-test.cc M src/kudu/fs/block_manager.h A src/kudu/fs/data_dirs-test.cc M src/kudu/fs/data_dirs.cc M src/kudu/fs/data_dirs.h M src/kudu/fs/file_block_manager.cc M src/kudu/fs/file_block_manager.h M src/kudu/fs/fs.proto M src/kudu/fs/fs_manager-test.cc M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/fs/log_block_manager-test.cc M src/kudu/fs/log_block_manager.cc M src/kudu/fs/log_block_manager.h M src/kudu/tablet/delta_compaction-test.cc M src/kudu/tablet/delta_compaction.cc M src/kudu/tablet/delta_compaction.h M src/kudu/tablet/delta_tracker.cc M src/kudu/tablet/deltafile-test.cc M src/kudu/tablet/deltamemstore-test.cc M src/kudu/tablet/diskrowset.cc M src/kudu/tablet/metadata.proto M src/kudu/tablet/multi_column_writer.cc M src/kudu/tablet/multi_column_writer.h M src/kudu/tablet/tablet_bootstrap-test.cc M src/kudu/tablet/tablet_metadata.cc M src/kudu/tablet/tablet_metadata.h M src/kudu/tools/kudu-tool-test.cc M src/kudu/tserver/tablet_copy_client-test.cc M src/kudu/tserver/tablet_copy_client.cc M src/kudu/tserver/tablet_copy_client.h M src/kudu/tserver/ts_tablet_manager.cc M src/kudu/tserver/ts_tablet_manager.h 38 files changed, 987 insertions(+), 190 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/6636/27 -- To view, visit http://gerrit.cloudera.org:8080/6636 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9828147f4fa5c4d7f6ed23441dca5a116b8cb11b Gerrit-PatchSet: 27 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew WongGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon