Alexey Serbin has posted comments on this change. Change subject: [rpc] handling ERROR_UNAVAILABLE RPC error ......................................................................
Patch Set 6: (5 comments) It seems that for the better trust in the test coverage I need to add one more scenario which would replace the authn token in the client context in the background (at random times) and after some time import that key into the master as well, while running some create-table-write-then-read-the-written-data-drop-the-table workload. I'll address your comments and add that scenario as well in the next version of this patch. http://gerrit.cloudera.org:8080/#/c/6640/6//COMMIT_MSG Commit Message: PS6, Line 16: SERVICE_UNAVAILABLE_TRY_LATER. > SERVICE_UNAVAILABLE Done http://gerrit.cloudera.org:8080/#/c/6640/6/src/kudu/integration-tests/security-unknown-tsk-itest.cc File src/kudu/integration-tests/security-unknown-tsk-itest.cc: Line 96: FLAGS_rpc_default_keepalive_time_ms = kConnectionKeepaliveMs_; > Doesn't this need to be set on the server? As I understand, setting this flag here does set that parameter on the server side as well since the MiniCluster is used here (not ExternalMiniCluster). I'll add a comment. Line 175: for (int i = 0; i < num_tablet_servers_; ++i) { > Why did you add multiple tablets? Is there something different about the m It seems I forgot to comment on this in the code -- will do. The idea was to have a scenario where client make calls to both the master and tablet servers. As I understand, the client populates its meta-cache with the information on tablet servers it needs to contact to perform the specified operation, not all the tablets for the specified table. In the beginning of the test the client makes a single insert into the first tablet with range [-inf, 0) and stores the info about that. Doing so allows to send write requests to the tablet server (not the the master) later on while trying to insert the same value. From the other side, it's necessary to exercise client-->master calls while populating the meta-cache in the context of inserting a value in the other, non-yet-discovered tablet. BTW, it seems I missed one piece there as well: trying to insert into another tablet while having the original token replaced. I will fix that. Line 261: // The client automatically retries on getting ServiceUnavailable from the > It looks like this retry code may be write-path specific. There is coverag Yep, there is a scan attempt as well, however it covers the client->master path. I need to update it to cover the client->tablet server path as well. Line 279: atomic<bool> importer_do_run(true); > This is pretty confusing. It seems like this bool won't get flipped till t That's mostly for failure scenarios when something goes wrong in the code below (like Apply() returns an error). I'll add a comment. -- To view, visit http://gerrit.cloudera.org:8080/6640 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I87d780a4ad88c15ceaacfddf6c1b69ed053bb959 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
