Todd Lipcon has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/8616 )

Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................

maintenance_manager: fix a deadlock on shutdown

The shutdown sequence of the tablet server first shuts down the maintenance
manager and then calls Unregister() on the registered ops.

This could hang on shutdown: the 'Shutdown()' call could run while some
maintenance ops were still waiting in the thread_pool_ queue, and those waiting
functions were silently removed from the queue. We depend on the functions
running to decrement the 'running_' count of the associated op, so when they
were removed silently, the 'Unregister()' call could block forever waiting for
the 'running_' count to go to 0.
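
The bookkeeping problem described above can be sketched with a small
single-threaded model. This is not the code from the patch: ToyOp, ToyQueue,
Drain() and LeakedRunsAfterShutdown() are hypothetical names invented for
illustration, and the sketch assumes the count is incremented when a run is
enqueued. It only shows why dropping queued functions leaves the count above 0,
and that running the queued functions before shutting down is one way to avoid
that.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a maintenance op. In this model, unregistering
// the op is only possible once `running` is back to 0.
struct ToyOp {
  int running = 0;  // scheduled runs that have not finished yet
};

// Hypothetical stand-in for the thread pool's task queue.
class ToyQueue {
 public:
  void Submit(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  // Runs every queued task to completion, in submission order.
  void Drain() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

  // Discards queued tasks without running them.
  void Shutdown() { tasks_.clear(); }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Schedules `num_tasks` runs of one op, shuts the queue down, and returns
// the value of `running` afterwards. A non-zero result models the case in
// which a wait for `running == 0` would never finish.
int LeakedRunsAfterShutdown(int num_tasks, bool drain_first) {
  assert(num_tasks >= 0);
  ToyOp op;
  ToyQueue queue;
  for (int i = 0; i < num_tasks; ++i) {
    ++op.running;                           // assumed: counted when enqueued
    queue.Submit([&op] { --op.running; });  // uncounted only if the task runs
  }
  if (drain_first) {
    queue.Drain();  // let queued work finish before discarding anything
  }
  queue.Shutdown();
  return op.running;
}
```

In this model, LeakedRunsAfterShutdown(3, false) leaves all three runs counted,
while LeakedRunsAfterShutdown(3, true) returns 0. The real code involves worker
threads and a blocking wait, which the model deliberately leaves out.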

This caused about 0.5% of runs of the new stop-tablet-itest
'TestShutdownWhileWriting' test case to time out. With this fix, no runs time
out.

Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Reviewed-on: http://gerrit.cloudera.org:8080/8616
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <[email protected]>
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 5 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
