Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8616 )
Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................

maintenance_manager: fix a deadlock on shutdown

The shutdown sequence of the tablet server first shuts down the
maintenance manager and then calls Unregister() on the registered ops.

This produced a potential hang on shutdown, since the 'Shutdown()' call
could run at the same time that some maintenance ops were waiting on the
thread_pool_ queue. Those waiting functions would be removed from the
queue silently. We depend on the functions running to decrement the
'running_' count of the associated op, so when they were removed
silently, the 'Unregister()' call could block forever waiting for the
'running_' count to go to 0.

This caused a timeout of about 0.5% of runs of the new stop-tablet-itest
'TestShutdownWhileWriting' test case. With this fix, no runs time out.

Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Reviewed-on: http://gerrit.cloudera.org:8080/8616
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <[email protected]>
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 5 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/8616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
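
For readers who want to see the hang described in the commit message in miniature, the following is a rough, single-threaded sketch and not the actual patch. The five inserted lines are not shown in this notification, so the "drain the queue before shutting down" remedy below is only an assumption about one way such a bug can be fixed. ToyPool, ToyOp, Schedule and RunningAfterShutdown are hypothetical names invented for this illustration; they are not Kudu's ThreadPool or MaintenanceOp.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a thread pool's pending-task queue.
// It runs tasks inline on the caller, so the model stays deterministic.
class ToyPool {
 public:
  void Submit(std::function<void()> task) { queue_.push_back(std::move(task)); }

  // Models draining: every queued task is run before returning.
  void Wait() {
    while (!queue_.empty()) {
      std::function<void()> task = std::move(queue_.front());
      queue_.pop_front();
      task();
    }
  }

  // Models the problematic behavior: queued tasks are dropped silently.
  void Shutdown() { queue_.clear(); }

 private:
  std::deque<std::function<void()>> queue_;
};

// Hypothetical op with a counter playing the role of 'running_'.
struct ToyOp {
  int running = 0;
};

// The count goes up when the task is enqueued and only comes back
// down if the task actually runs.
void Schedule(ToyPool* pool, ToyOp* op) {
  ++op->running;
  pool->Submit([op] { --op->running; });
}

// Returns the op's count after shutdown. A nonzero result is the state
// in which an Unregister()-style wait for zero would never finish.
int RunningAfterShutdown(bool drain_first) {
  ToyPool pool;
  ToyOp op;
  Schedule(&pool, &op);
  Schedule(&pool, &op);
  if (drain_first) {
    pool.Wait();  // Assumed remedy: let queued tasks run first.
  }
  pool.Shutdown();
  return op.running;
}
```

In this model, RunningAfterShutdown(false) leaves the count at 2, which corresponds to the state where Unregister() would block forever, while RunningAfterShutdown(true) brings it back to 0.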
