Todd Lipcon has posted comments on this change.

Change subject: [c++ client] re-acquire authn token if expired
......................................................................
Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/6648/2/src/kudu/integration-tests/authn_token_expire-itest.cc
File src/kudu/integration-tests/authn_token_expire-itest.cc:

Line 80:       token_validity_seconds_(2),
is this going to be flaky in TSAN? or is it OK because the new behavior
will get it to retry until it happens to run fast enough?

PS2, Line 78:     : num_tablet_servers_(2),
:       heartbeat_interval_ms_(10),
:       token_validity_seconds_(2),
:       tsk_rotation_seconds_(1),
:       key_column_name_("key"),
:       key_split_value_(8) {
why are all of these members instead of just constants?

Line 88:   //FLAGS_scanner_gc_check_interval_us = 50 * 1000;
?

Line 98:   CHECK_OK(b.Build(&schema_));
if you don't care about the schema in particular, can you use one of the
existing ones like CreateKeyValueTestSchema?

PS2, Line 166: size_t
prefer signed type (int)

PS2, Line 177: NO
typo

PS2, Line 213: // Since the authn token is verified upon connection establishment, it's
: // necessary to make sure the client does not keep the connections to
: // corresponding tablet servers open. So, let's restart the tablet servers
: // since the RPC layer might keep already established connections open.
we could set the rpc keepalive timer to short, instead?

Line 229:   //ASSERT_OK(scanner.SetFaultTolerant());
?

Line 230:   ASSERT_OK(scanner.Open());
this is only really testing the case of the token being expired at the
beginning of a Scan and not in the middle of one. How can we get coverage
of the case where it expires mid-scan (ie when we are transitioning to a
new tablet or continuing an existing scanner)?

similarly, I'm not sure I trust that we've covered all of the various
master lookup paths (eg an expired token when trying to look up tablet
locations).

I'm wondering whether there is any randomized test we can do that will
encourage more of these potential overlaps of expiration with various
client ops. One idea would be to explicitly inject false "expired" errors.
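A rough, self-contained sketch of what that probabilistic injection could look like (all names here are hypothetical and the code stands alone rather than using actual Kudu internals; in a real patch the probability would come from a gflag rather than a plain argument):

```cpp
// Standalone sketch of probabilistic fault injection for token-expiration
// errors. A server-side RPC handler would consult this before dispatching
// each call and, when told to inject, respond with the same token-expiration
// error the client already handles, forcing the re-acquire path.
#include <random>

enum class RpcStatus { kOk, kTokenExpired };

// Returns kTokenExpired on roughly `probability` of calls, kOk otherwise.
RpcStatus MaybeInjectTokenExpiration(double probability, std::mt19937* rng) {
  std::bernoulli_distribution inject(probability);
  return inject(*rng) ? RpcStatus::kTokenExpired : RpcStatus::kOk;
}
```

Run with a probability like 0.2 on every RPC, this would exercise expiration at arbitrary points (mid-scan, tablet-location lookups, etc.) without having to line up wall-clock token lifetimes with test timing.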
eg start all servers with a flag --inject_token_expiration_errors=0.2 which
says that the server should respond with a token expiration error on 20% of
RPCs. Or perhaps inject in the RPC client itself if that's more convenient.
Then we can be pretty sure that we'll get coverage of lots of different
cases without having to carefully maneuver timings so that tokens expire
at exactly certain points.

PS2, Line 230: ASSERT_OK(scanner.Open());
:   size_t row_count = 0;
:   while (scanner.HasMoreRows()) {
:     vector<KuduRowResult> rows;
:     ASSERT_OK(scanner.NextBatch(&rows));
:     row_count += rows.size();
:   }
:   EXPECT_EQ(2, row_count);
I think client-test-util has some code to CountRowsFromScanner or something


http://gerrit.cloudera.org:8080/#/c/6648/2/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

Line 159:   c->call->SetFailed(status, rpc_error.release());
wouldn't this only send the RPC error to the first of the awaiting calls?
It's possible with multiple threads in the client that you'd have many
pending RPC calls on a single connection while it's in the process of
negotiation.


http://gerrit.cloudera.org:8080/#/c/6648/2/src/kudu/rpc/retriable_rpc.h
File src/kudu/rpc/retriable_rpc.h:

PS2, Line 71: of
or

PS2, Line 92: autentication
typo


--
To view, visit http://gerrit.cloudera.org:8080/6648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I418497ac59cfd4e476e9bfc6abe6b10b487712f5
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes
