Hello Mike Percy, Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8616

to review the following change.


Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................

maintenance_manager: fix a deadlock on shutdown

The shutdown sequence of the tablet server first shuts down the maintenance
manager and then calls Unregister() on the registered ops.

This produced a potential hang on shutdown, since the 'Shutdown()' call could
run at the same time that some maintenance ops were waiting on the thread_pool_
queue. Those waiting functions would be removed from the queue silently. We
depend on the functions running to decrement the 'running_' count of the
associated op, so when they were removed silently, the 'Unregister()' call
could block forever waiting for the 'running_' count to go to 0.
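
For illustration only, the hazard described above can be modeled with a small
single-threaded sketch. ToyOp, ToyPool, Schedule and ShutdownThenCheck are
hypothetical names invented for this example; they are not Kudu's classes and
this is not the code in this change. Draining the queue before shutting the
pool down is shown as one possible way to avoid the stuck counter, which is an
assumption on my part, not a description of the actual 5-line fix.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a maintenance op; only the counter matters here.
struct ToyOp {
  int running = 0;  // bumped when a task is queued, dropped when it runs
  // The real Unregister() would block until this is true; here we only test it.
  bool CanUnregister() const { return running == 0; }
};

// Hypothetical single-threaded model of a task queue (not Kudu's ThreadPool).
class ToyPool {
 public:
  void Submit(std::function<void()> task) { queue_.push_back(std::move(task)); }

  // Runs every queued task in FIFO order.
  void Drain() {
    while (!queue_.empty()) {
      std::function<void()> task = std::move(queue_.front());
      queue_.pop_front();
      task();
    }
  }

  // Drops queued tasks without running them; returns how many were dropped.
  std::size_t Shutdown() {
    std::size_t dropped = queue_.size();
    queue_.clear();
    return dropped;
  }

 private:
  std::deque<std::function<void()>> queue_;
};

// Queues one task for 'op'. The count goes up at enqueue time and only
// comes back down if the task body actually executes.
void Schedule(ToyPool* pool, ToyOp* op) {
  ++op->running;
  pool->Submit([op] {
    assert(op->running > 0);
    --op->running;
  });
}

// Returns true if the op could be unregistered after the pool shuts down.
bool ShutdownThenCheck(bool drain_first) {
  ToyPool pool;
  ToyOp op;
  Schedule(&pool, &op);
  if (drain_first) {
    pool.Drain();  // assumed remedy: let queued tasks run before shutdown
  }
  pool.Shutdown();
  return op.CanUnregister();
}
```

In this model, ShutdownThenCheck(false) leaves the count at 1, which
corresponds to the case where 'Unregister()' would wait forever, while
ShutdownThenCheck(true) brings it back to 0.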

This caused a timeout in about 0.5% of runs of the new stop-tablet-itest
'TestShutdownWhileWriting' test case. With this fix, no runs time out.

Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/8616/1
--
To view, visit http://gerrit.cloudera.org:8080/8616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
