Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9551 )

Change subject: KUDU-2289: Tablet deletion should be throttled
......................................................................


Patch Set 3:

(6 comments)

Do you have any advice on how to test this? I'm not setting a queue length, the 
idea being that DeleteTablets can wait in the queue instead of making the 
master constantly retry them, so the DeleteTablet RPCs will no be throttled. I 
can reproduce symptoms of the original problem in 
CreateTableStressTest.CreateAndDeleteBigTable by turning up the number of 
tablets in the big table-- there'll be service-queue-full rejections of 
CreateTablet and DeleteTablet calls in the logs-- and those go away for 
DeleteTablets when this change is applied. DeleteTablet is an admin service 
RPC, so I can't, for example, Ping the service to see if the queue is full.

http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/tablet_service.cc@807
PS3, Line 807: context, req, resp
> I think they're guaranteed to live until the call to context->RespondSucces
Thanks Adar. Here's a relevant snippet from the comment on 
RpcContext::RespondSuccess() in rpc_context.h:

 // After this method returns, this RpcContext object is destroyed. The request
 // and response protobufs are also destroyed.

Likewise, the context, request, and response objects are also destroyed when 
calling the other RpcContext::Respond* methods.


http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.h@142
PS3, Line 142:   Status DeleteTablet(const std::string& tablet_id,
> Could we make the synchronous version private? AFAICT it's now only used in
DeleteTabletRunnable calls DeleteTablet. I could make DeleteTabletRunnable a 
friend of TSTabletManager and make DeleteTablet private, but that feels like 
overkill because DeleteTabletRunnable just needs access to one method: a 
synchronous DeleteTablet().


http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.cc@278
PS3, Line 278:   virtual void DisableCallback() {
> This doesn't need to be virtual, does it?
Done


http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.cc@321
PS3, Line 321: max_open_threads
> This should be max_delete_threads.
Done


http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.cc@840
PS3, Line 840:       new DeleteTabletRunnable(this, tablet_id, delete_type, 
cas_config_index, cb));
> Nit: use std::make_shared to make the runnable.
Done


http://gerrit.cloudera.org:8080/#/c/9551/3/src/kudu/tserver/ts_tablet_manager.cc@848
PS3, Line 848:   if (s.IsServiceUnavailable()) {
             :     cb(s, TabletServerErrorPB::THROTTLED);
             :     return;
             :   }
             :   cb(s, TabletServerErrorPB::UNKNOWN_ERROR);
> Simpler as:
Done



--
To view, visit http://gerrit.cloudera.org:8080/9551
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3819bf8a3acf8ea03a76cc6cacd92d85bb114998
Gerrit-Change-Number: 9551
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Mon, 19 Mar 2018 19:00:10 +0000
Gerrit-HasComments: Yes

Reply via email to