[ https://issues.apache.org/jira/browse/THRIFT-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582535#comment-15582535 ]
ASF GitHub Bot commented on THRIFT-3932: ---------------------------------------- Github user jeking3 commented on the issue: https://github.com/apache/thrift/pull/1103 What you described is a failing in the original set of thread factories to provide common functionality (detached, policy, priority) in the base class. Only PosixThreadFactory has policy and priority, and yet it's quite likely that all of them would benefit from these. I moved the detached information into the base class as a first step. I could certainly add back setDetached into the base and mark it deprecated, however that breaks the contract that the servers require - that the detached disposition cannot change while the server is running, so it really doesn't belong... I still have no valid use case for changing detached disposition of a thread manager while it is in use. Now that said, I believe that Thread knows if it is detached or not; if Thread had a "joinIfJoinable()" method then that server requirement would go away since the TConnectedClient could just call that and it would do the right thing, and setDetached() could be added back in avoiding that particular breaking change. I don't find breaking changes to be a problem when they remove behavior that has no use case to support it, however. There's a lot of technical debt in the concurrency area that still could be addressed - this was a first pass as trying to shore it up and ensure functions provided are actually tested and work, and that tests don't take 100 seconds each. > C++ ThreadManager has a rare termination race > --------------------------------------------- > > Key: THRIFT-3932 > URL: https://issues.apache.org/jira/browse/THRIFT-3932 > Project: Thrift > Issue Type: Bug > Components: C++ - Library > Reporter: Buğra Gedik > Assignee: James E. King, III > Fix For: 0.11.0 > > Attachments: thrift-patch > > Time Spent: 17h > Remaining Estimate: 0h > > {{ThreadManger::join}} calls {{stopImpl(true)}}, which in turn calls > {{removeWorker(workerCount_);}}. The latter waits until {{while (workerCount_ > != workerMaxCount_)}}. Within the {{run}} method of the workers, the last > thread that detects {{workerCount_ == workerMaxCount_}} notifies > {{removeWorker}}. The {{run}} method has the following additional code that > is executed at the very end: > {code} > { > Synchronized s(manager_->workerMonitor_); > manager_->deadWorkers_.insert(this->thread()); > if (notifyManager) { > manager_->workerMonitor_.notify(); > } > } > {code} > This is an independent synchronized block. Now assume 2 threads. One of them > has {{notifyManager=true}} as it detected the {{workerCount_ == > workerMaxCount_}} condition earlier. It is possible that this thread gets to > execute the above code block first, {{ThreadManager}}'s {{removeWorker}} > method unblocks, and eventually {{ThreadManager}}'s {{join}} returns and the > object is destructed. When the other thread reaches the synchronized block > above, it will crash, as the manager is not around anymore. > Besides, {{ThreadManager}} never joins its threads. > Attached is a patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)