[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread tpcwang
Github user tpcwang commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-215848959
  
LGTM. Thanks for fixing this issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61612686
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

I reverted all changes to this file.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61601611
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

I had missed that it is using different monitors and thus different 
mutexes. Thanks for clarifying.

I would still argue that this new implementation is worse than the existing 
implementation. With the existing implementation, concurrent **writes** to 
`workerMaxCount_` and `workerCount_` is synchronized, but the reading of those 
values is not synchronized. Reading an aligned integer is most likely atomic on 
most architectures, so this might not even be that unsafe (?). Getting either 
the old value or the new value will not invalidate what it is trying to do.

On the other hand, the new implementation has a problem with concurrent 
removeWorker and addWorker as far as I can tell. A removeWorker can be stuck 
forever if a addWorker call comes in after the removeWorker sets the local 
`goalCount` because it will never hit that goal with new workers that are 
produced by addWorker.

Hopefully I didn't miss another detail. Thanks for working with me to 
address these concerns.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61589086
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

Old line 399 modifies it under the monitor ``monitor_`` and then old line 
424 acquired it under a different monitor ``workerMonitor_``.  One cannot use 
different monitors to control access to the same variable and expect any 
synchronization.  This is why I squirrel away the new value into goalCount and 
use it later on.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61581424
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

It is not accessing `workerMaxCount_` outside of a lock. See line 424 where 
it uses the `Synchronized` object which grabs the mutex that the monitor owns. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-29 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61558381
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

Old line 424 was accessing ``workerMaxCount_`` outside of a lock so its 
behavior was unpredictable when multiple threads called ``removeWorker()`` 
simultaneously.  Saving the value ti ``goalCount`` removes this 
unpredictability and therefore is an improvement.  Also see THRIFT-3233 for 
another fix submitted by someone else.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-28 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r61534321
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

I would be happy to give my vote for this to be merged in, if you can 
address my concern on the change in this file. I still don't think this new 
implementation correctly handles concurrent addWorker/removeWorker calls. I 
also think the existing implementation actually handles concurrent removeWorker 
calls correctly, so it is not clear to me why this change is necessary. Sorry 
if I miss anything obvious.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-28 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-215502049
  
Looking for an advocate who would consider merging this fix despite the 
busted Apache Jenkins environment result.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-25 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-214391825
  
Looking for an advocate to consider merging this.  It passed all three 
builds with the exception of something that happened on the apache jenkins 
where a cppcheck file was not created:

> [Cppcheck] Starting the cppcheck analysis.
> [Cppcheck] No cppcheck test report file(s) were found with the pattern 
'cppcheck-result.xml' relative to 
'/home/jenkins/jenkins-slave/workspace/Thrift-precommit'.  Did you enter a 
pattern relative to the correct directory?  Did you generate the XML report(s) 
for Cppcheck?
> [Cppcheck] Parsing throws exceptions. No cppcheck test report file(s) 
were found with the pattern 'cppcheck-result.xml' relative to 
'/home/jenkins/jenkins-slave/workspace/Thrift-precommit'.  Did you enter a 
pattern relative to the correct directory?  Did you generate the XML report(s) 
for Cppcheck?
> Build step 'Publish Cppcheck results' changed build result to FAILURE



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-21 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60614631
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

In that particular scenario, the second removeWorker should fail since we 
should be synchronizing correctly when it checks if (value > workerMaxCount_), 
but I agree that ThreadManager probably needs some TLC.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-21 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60614134
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

ThreadManager should probably be tossed out; removeWorker is inherently 
dangerous for a variety of reasons just like this.  In particular I can see a 
scenario where multiple workers simultaneously try to reduce the worker count 
by the same amount and fail badly because they both thought there were 5 
workers and they both asked to removeWorker(3).  The API would be much better 
as specifying what the new limit should be and the reduction of intention vs 
setting a barrier until you get there should be separate operations.  It seems 
pretty fragile as-is and I think the whole thing needs a rethink.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-21 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60612338
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

I'm not sure I understand what the problem is. Why would calling 
removeWorker() at the same time clobber each other's desire goal? They have a 
shared goal of reducing workerMaxCount_ and both will eventually get there, 
right?

However, I think the new implementation does have problems with addWorker 
and removeWorker. Say I start with 1 worker:
(1) Thread1 calls removeWorker(1) - goalCount is 0, workerCount_ is 1
(2) Thread2 calls addWorker(2) - goalCount is 0, workerCount_ is 3
(3) Thread3 worker is reaped - goalCount is 0, workerCount_ is 2 <- this 
will stay at 2 forever

In any case, we may want to discuss this in a separate change if there is a 
problem with the current implementation since this pull request no longer uses 
the thread manager.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-21 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60591056
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
--- End diff --

I don't see an easy way around this.  The threaded server must be able to 
join.  The caller could pass a non-detached ThreadManager into the ctor and 
then setDetached(true) before or while serve() is running - there are a variety 
of things we cannot control.  The default behavior is to use a detached 
ThreadManager, so if I changed this to a check with an invalid_argument or 
logic_error exception, there would have been more work to do in the default 
simple thread manager static factory and related calls.

On line 38 of the header this behavior is documented, at least.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-21 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60590542
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

This routine blocks until there are fewer actual workers than the number of 
workers when it was called less the value passed in.  On a busy system where 
workers are being added this means it might be in here for a while.  I did not 
change this behavior; simply I ensured that the effects of multiple threads 
calling removeWorker() at the same time don't clobber each-other's desired goal 
of how many to remove.

It would be better to have two calls; one to trim the maximum number of 
workers but it doesn't block; another being a barrier you can use to get to 
that number.  Callers may want to trim without blocking, for example.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-19 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r60227644
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,69 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
   TServerFramework::serve();
 
-  // Drain all clients - no more will arrive
-  try {
-Synchronized s(clientsMonitor_);
-while (getConcurrentClientCount() > 0) {
-  clientsMonitor_.wait();
-}
-  } catch (TException& tx) {
-string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
-GlobalOutput(errStr.c_str());
+  // Ensure post-condition of no active clients
+  Synchronized s(clientMonitor_);
+  while (!activeClientMap_.empty()) {
+clientMonitor_.wait();
+  }
+
+  drainDeadClients();
+}
+
+void TThreadedServer::drainDeadClients() {
+  // we're in a monitor here
+  while (!deadClientMap_.empty()) {
+ClientMap::iterator it = deadClientMap_.begin();
+it->second->join();
+deadClientMap_.erase(it);
   }
 }
 
 void TThreadedServer::onClientConnected(const 
shared_ptr& pClient) {
-  threadFactory_->newThread(pClient)->start();
+  Synchronized sync(clientMonitor_);
+  activeClientMap_.insert(ClientMap::value_type(pClient.get(), 
boost::make_shared(pClient)));
+  ClientMap::iterator it = activeClientMap_.find(pClient.get());
--- End diff --

I implemented this suggestion.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-17 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59996235
  
--- Diff: lib/cpp/test/TServerIntegrationTest.cpp ---
@@ -152,7 +155,10 @@ class TServerIntegrationTestFixture {
   new TServerSocket("localhost", 0)),
   boost::shared_ptr(new 
TTransportFactory),
   boost::shared_ptr(new 
TBinaryProtocolFactory))),
-  pEventHandler(boost::shared_ptr(new 
TServerReadyEventHandler)) {
+  pEventHandler(boost::shared_ptr(new 
TServerReadyEventHandler)),
+ bStressDone(false),
--- End diff --

Looks like the tabs are not aligned correctly in this document.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-17 Thread tpcwang
Github user tpcwang commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-211137935
  
I think this change looks good for the most part, maybe we can get one of 
the maintainers to look at it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-17 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59996087
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
--- End diff --

I am *slightly* concerned about user reusing the thrift factory in other 
part of the code, so changing this can change the function of their other code. 
I'm not too sure what the maintainers usually do when there are some minor 
functional changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-17 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59995963
  
--- Diff: lib/cpp/src/thrift/concurrency/ThreadManager.cpp ---
@@ -421,7 +416,7 @@ void ThreadManager::Impl::removeWorker(size_t value) {
   {
 Synchronized s(workerMonitor_);
 
-while (workerCount_ != workerMaxCount_) {
+while (workerCount_ > goalCount) {
--- End diff --

What if an addWorker call was made in between this and when goalCount was 
set? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-12 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-209039437
  
@jfarrell Could you look into why the Jenkins build for apache thinks that 
a c_glib file needs to be merged manually and/or has an open pull request on 
the apache build system?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-12 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59410629
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.h ---
@@ -83,12 +88,56 @@ class TThreadedServer : public TServerFramework {
   virtual void serve();
 
 protected:
+  /**
+   * Drain recently connected clients by joining their threads - this is 
done lazily because
+   * we cannot do it inside the thread context that is disconnecting.
+   */
+  virtual void drainDeadClients();
+
+  /**
+   * Implementation of TServerFramework::onClientConnected
+   */
   virtual void onClientConnected(const 
boost::shared_ptr& pClient) /* override */;
-  virtual void onClientDisconnected(TConnectedClient* pClient) /* override 
*/;
+
+  /**
+   * Implementation of TServerFramework::onClientDisconnected
+   */
+  virtual void onClientDisconnected(TConnectedClient *pClient) /* override 
*/;
 
   boost::shared_ptr 
threadFactory_;
-  apache::thrift::concurrency::Monitor clientsMonitor_;
+
+  /**
+   * A helper wrapper used to wrap the client in something we can use to 
maintain
+   * the lifetime of the connected client within a detached thread.
--- End diff --

This busted StressTest and StressTestNonBlocking because they are coded to 
assume that the thread shared pointer will hold on to the runnable after the 
thread is gone.  The TConnectedClientTracker is therefore necessary to avoid 
breaking existing clients that might depend on this behavior of Thread, so I am 
not going to simplify to a map of TConnectedClient* to shared_ptr.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-12 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59390673
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.h ---
@@ -83,12 +88,56 @@ class TThreadedServer : public TServerFramework {
   virtual void serve();
 
 protected:
+  /**
+   * Drain recently connected clients by joining their threads - this is 
done lazily because
+   * we cannot do it inside the thread context that is disconnecting.
+   */
+  virtual void drainDeadClients();
+
+  /**
+   * Implementation of TServerFramework::onClientConnected
+   */
   virtual void onClientConnected(const 
boost::shared_ptr& pClient) /* override */;
-  virtual void onClientDisconnected(TConnectedClient* pClient) /* override 
*/;
+
+  /**
+   * Implementation of TServerFramework::onClientDisconnected
+   */
+  virtual void onClientDisconnected(TConnectedClient *pClient) /* override 
*/;
 
   boost::shared_ptr 
threadFactory_;
-  apache::thrift::concurrency::Monitor clientsMonitor_;
+
+  /**
+   * A helper wrapper used to wrap the client in something we can use to 
maintain
+   * the lifetime of the connected client within a detached thread.
--- End diff --

I made this change and it will only work if I modify each of the thread 
implementations to release the smart_ptr to runnable in the thread class after 
the thread runs.  I assume this is acceptable; once the runnable runs we don't 
need the thread smart_ptr holding on to it any more; TServerFramework requires 
the TConnectedClient (the runnable) to be destroyed when the client disconnects 
to function properly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-12 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59390310
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
--- End diff --

I could require it to be detached in the constructor and throw an invalid 
argument if it is not, I suppose, rather than mutating the thread factory in 
serve.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59306037
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.h ---
@@ -83,12 +88,56 @@ class TThreadedServer : public TServerFramework {
   virtual void serve();
 
 protected:
+  /**
+   * Drain recently connected clients by joining their threads - this is 
done lazily because
+   * we cannot do it inside the thread context that is disconnecting.
+   */
+  virtual void drainDeadClients();
+
+  /**
+   * Implementation of TServerFramework::onClientConnected
+   */
   virtual void onClientConnected(const 
boost::shared_ptr& pClient) /* override */;
-  virtual void onClientDisconnected(TConnectedClient* pClient) /* override 
*/;
+
+  /**
+   * Implementation of TServerFramework::onClientDisconnected
+   */
+  virtual void onClientDisconnected(TConnectedClient *pClient) /* override 
*/;
 
   boost::shared_ptr 
threadFactory_;
-  apache::thrift::concurrency::Monitor clientsMonitor_;
+
+  /**
+   * A helper wrapper used to wrap the client in something we can use to 
maintain
+   * the lifetime of the connected client within a detached thread.
--- End diff --

You are correct, we can just store the thread instead of this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59306085
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
   TServerFramework::serve();
 
-  // Drain all clients - no more will arrive
-  try {
-Synchronized s(clientsMonitor_);
-while (getConcurrentClientCount() > 0) {
-  clientsMonitor_.wait();
-}
-  } catch (TException& tx) {
-string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
-GlobalOutput(errStr.c_str());
+  // Ensure post-condition of no active clients
+  Synchronized s(clientMonitor_);
+  while (!activeClientMap_.empty()) {
+clientMonitor_.wait();
+  }
+
+  drainDeadClients();
+}
+
+void TThreadedServer::drainDeadClients() {
+  // we're in a monitor here
+  while (!deadClientMap_.empty()) {
+ClientMap::iterator it = deadClientMap_.begin();
+it->second->join();
+deadClientMap_.erase(it);
   }
 }
 
 void TThreadedServer::onClientConnected(const 
shared_ptr& pClient) {
-  threadFactory_->newThread(pClient)->start();
+  Synchronized sync(clientMonitor_);
+  drainDeadClients();
--- End diff --

Sure we could do it that way.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59305327
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
   TServerFramework::serve();
 
-  // Drain all clients - no more will arrive
-  try {
-Synchronized s(clientsMonitor_);
-while (getConcurrentClientCount() > 0) {
-  clientsMonitor_.wait();
-}
-  } catch (TException& tx) {
-string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
-GlobalOutput(errStr.c_str());
+  // Ensure post-condition of no active clients
+  Synchronized s(clientMonitor_);
+  while (!activeClientMap_.empty()) {
+clientMonitor_.wait();
+  }
+
+  drainDeadClients();
+}
+
+void TThreadedServer::drainDeadClients() {
+  // we're in a monitor here
+  while (!deadClientMap_.empty()) {
+ClientMap::iterator it = deadClientMap_.begin();
+it->second->join();
+deadClientMap_.erase(it);
   }
 }
 
 void TThreadedServer::onClientConnected(const 
shared_ptr& pClient) {
-  threadFactory_->newThread(pClient)->start();
+  Synchronized sync(clientMonitor_);
+  drainDeadClients();
--- End diff --

What do you think about draining in onClientDisconnected instead of 
onClientConnected (note to take care of draining before adding the thread that 
initiated onClientDisconnected to the list of dead clients)? I think it might 
be better to make onClientConnected as fast as possible because the serve 
thread is the caller?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59304879
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.h ---
@@ -83,12 +88,56 @@ class TThreadedServer : public TServerFramework {
   virtual void serve();
 
 protected:
+  /**
+   * Drain recently connected clients by joining their threads - this is 
done lazily because
+   * we cannot do it inside the thread context that is disconnecting.
+   */
+  virtual void drainDeadClients();
+
+  /**
+   * Implementation of TServerFramework::onClientConnected
+   */
   virtual void onClientConnected(const 
boost::shared_ptr& pClient) /* override */;
-  virtual void onClientDisconnected(TConnectedClient* pClient) /* override 
*/;
+
+  /**
+   * Implementation of TServerFramework::onClientDisconnected
+   */
+  virtual void onClientDisconnected(TConnectedClient *pClient) /* override 
*/;
 
   boost::shared_ptr 
threadFactory_;
-  apache::thrift::concurrency::Monitor clientsMonitor_;
+
+  /**
+   * A helper wrapper used to wrap the client in something we can use to 
maintain
+   * the lifetime of the connected client within a detached thread.
--- End diff --

It's not a detached thread anymore, but why is this needed? I think 
ClientMap be a map from TConnectedClient* to 
shared_ptr because Thread itself holds on 
to the TConnectedClient and maintains its lifetime since it needs to run it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r59304884
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -90,31 +95,71 @@ TThreadedServer::~TThreadedServer() {
 }
 
 void TThreadedServer::serve() {
+  threadFactory_->setDetached(false);
--- End diff --

This is a functional change that is subtle and may break existing users of 
TThreadedServer. I don't have a better solution at the moment, but I do want to 
call this out to see if others have a better solution.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-11 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-208513794
  
I'm starting to believe that the TThreadServerPool implementation may have 
the same issue.  The cores I have been able to produce through repeated testing 
are showing a pthread running when nothing else is, and it usually cores 
because it is trying to lock a mutex that has been destroyed.  I'm looking into 
this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-10 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-208013575
  
I ran the test that failed in the apache build in a loop and finally ended 
up in a deadlock situation, so I am looking at that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-08 Thread jeking3
Github user jeking3 commented on the pull request:

https://github.com/apache/thrift/pull/980#issuecomment-207675424
  
I did not touch "go" or "py" implementations and that was the reason for 
the failure in #2467.2.  #2467.1 failed in three tests including go-cpp/json 
however given other go tests are failing it looks like the issue is likely in 
"go".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-07 Thread jeking3
Github user jeking3 commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r58851243
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -92,29 +97,42 @@ TThreadedServer::~TThreadedServer() {
 void TThreadedServer::serve() {
   TServerFramework::serve();
 
-  // Drain all clients - no more will arrive
-  try {
-Synchronized s(clientsMonitor_);
-while (getConcurrentClientCount() > 0) {
-  clientsMonitor_.wait();
-}
-  } catch (TException& tx) {
-string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
-GlobalOutput(errStr.c_str());
+  // Ensure post-condition of no active clients
+  Synchronized s(clientMonitor_);
+  while (!clientMap_.empty()) {
+clientMonitor_.wait();
   }
 }
 
 void TThreadedServer::onClientConnected(const 
shared_ptr& pClient) {
-  threadFactory_->newThread(pClient)->start();
+  Synchronized sync(clientMonitor_);
+  clientMap_.insert(ClientMap::value_type(pClient.get(), 
boost::make_shared(pClient)));
+
+  // We do not track the threads themselves
+  ClientMap::const_iterator it = clientMap_.find(pClient.get());
+  threadFactory_->newThread(it->second)->start();
 }
 
 void TThreadedServer::onClientDisconnected(TConnectedClient* pClient) {
-  THRIFT_UNUSED_VARIABLE(pClient);
-  Synchronized s(clientsMonitor_);
-  if (getConcurrentClientCount() == 0) {
-clientsMonitor_.notify();
+  Synchronized sync(clientMonitor_);
+  clientMap_.erase(pClient);
+  if (clientMap_.empty()) {
--- End diff --

Understood.  I'm working on that now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-06 Thread tpcwang
Github user tpcwang commented on a diff in the pull request:

https://github.com/apache/thrift/pull/980#discussion_r58814042
  
--- Diff: lib/cpp/src/thrift/server/TThreadedServer.cpp ---
@@ -92,29 +97,42 @@ TThreadedServer::~TThreadedServer() {
 void TThreadedServer::serve() {
   TServerFramework::serve();
 
-  // Drain all clients - no more will arrive
-  try {
-Synchronized s(clientsMonitor_);
-while (getConcurrentClientCount() > 0) {
-  clientsMonitor_.wait();
-}
-  } catch (TException& tx) {
-string errStr = string("TThreadedServer: Exception joining workers: ") 
+ tx.what();
-GlobalOutput(errStr.c_str());
+  // Ensure post-condition of no active clients
+  Synchronized s(clientMonitor_);
+  while (!clientMap_.empty()) {
+clientMonitor_.wait();
   }
 }
 
 void TThreadedServer::onClientConnected(const 
shared_ptr& pClient) {
-  threadFactory_->newThread(pClient)->start();
+  Synchronized sync(clientMonitor_);
+  clientMap_.insert(ClientMap::value_type(pClient.get(), 
boost::make_shared(pClient)));
+
+  // We do not track the threads themselves
+  ClientMap::const_iterator it = clientMap_.find(pClient.get());
+  threadFactory_->newThread(it->second)->start();
 }
 
 void TThreadedServer::onClientDisconnected(TConnectedClient* pClient) {
-  THRIFT_UNUSED_VARIABLE(pClient);
-  Synchronized s(clientsMonitor_);
-  if (getConcurrentClientCount() == 0) {
-clientsMonitor_.notify();
+  Synchronized sync(clientMonitor_);
+  clientMap_.erase(pClient);
+  if (clientMap_.empty()) {
--- End diff --

This doesn't fix the issue because as soon as you notify clientMonitor_, 
TThreadedServer::serve() will exit and clients are free to release 
TThreadedServer, but the thread that initiated onClientDisconnected is still in 
the middle of disposeConnectedClient, which it should not be executing any more 
because the object is already destroyed.

Try adding one second sleep after the delete in disposeConnectedClient and 
run the tests. I see the same segfault in TServerIntegrationTest. I am not sure 
we can get away with not joining the threads.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] thrift pull request: THRIFT-3768: ensure TThreadedServer guarantee...

2016-04-06 Thread jeking3
GitHub user jeking3 opened a pull request:

https://github.com/apache/thrift/pull/980

THRIFT-3768: ensure TThreadedServer guarantees the lifetime of the client

TThreadedServer now guarantees the lifetime of the TConnectedClient.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jeking3/thrift defect/THRIFT-3768-alternate

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/thrift/pull/980.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #980


commit 7132ec16086daf2b2808e3a91f78ecfbca6a5dab
Author: Jim King 
Date:   2016-04-05T16:17:51Z

THRIFT-3768: ensure TThreadedServer guarantees the lifetime of the client




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---