Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12112 )
Change subject: [util] use lighter locking scheme for ThreadMgr ...................................................................... Patch Set 1: (4 comments) Thank you for the review. I updated the webserver-stress-itest to make it cover this piece of functionality, but so far it passes. However, my attempts to expose the races if removing the locks at some places are not yet successful. I'll post an update for the test when I have repro scenarios for anticipated TSAN races. http://gerrit.cloudera.org:8080/#/c/12112/1/src/kudu/util/thread.cc File src/kudu/util/thread.cc: http://gerrit.cloudera.org:8080/#/c/12112/1/src/kudu/util/thread.cc@165 PS1, Line 165: std::lock_guard<decltype(lock_)> l(lock_); > Is this actually necessary? No other thread should have access to ThreadMgr I also was curious regarding this lock, and decided to keep it because I suspected there were some TSAN-related complications if removing it :) OK, let's remove this and see -- if something surfaces from this place, I'll fix it in a follow-up changelist. http://gerrit.cloudera.org:8080/#/c/12112/1/src/kudu/util/thread.cc@263 PS1, Line 263: std::lock_guard<decltype(lock_)> l(lock_); > Why do we need this acquisition? Good catch. It seems this is not needed: nothing that should be guarded accessed directly in this method and also this method is being executed under std::once's implicit guard. http://gerrit.cloudera.org:8080/#/c/12112/1/src/kudu/util/thread.cc@346 PS1, Line 346: void ThreadMgr::PrintThreadDescriptorRow(const ThreadDescriptor& desc, > DCHECK that lock_ isn't held? I thought that would be a good idea and added DCHECK(!lock_.is_locked()) here, but when running tests with multiple concurrent thread it failed: it seems that due to the concurrency that check triggered. I tried to find a way to check whether this particular thread holds the lock, but I didn't find that. I also looked at AssertWaitAllowed/AssertIOAllowed, but I'm not sure it possible to use those as we would want. Please let me know if I'm missing something here. http://gerrit.cloudera.org:8080/#/c/12112/1/src/kudu/util/thread.cc@402 PS1, Line 402: "<h4>" << threads_running_metric_ << " thread(s) running" > Doesn't this access need to be protected by lock_? Ah, good catch: I need to move the shared_lock<> a few lines above. Thanks, I'll add the test. -- To view, visit http://gerrit.cloudera.org:8080/12112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4d49c1c39392e01c45019844430a4fe3d116c277 Gerrit-Change-Number: 12112 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Thu, 20 Dec 2018 02:31:43 +0000 Gerrit-HasComments: Yes
