[ https://issues.apache.org/jira/browse/THRIFT-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224493#comment-15224493 ]
James E. King, III edited comment on THRIFT-3768 at 4/4/16 6:48 PM: -------------------------------------------------------------------- Yuck.. TThreadedServer doesn't actually follow the interface requirements of TServerFramework. The onClientConnected() method is documented to require the implementation to store the connected client and manage the lifetime of it. TThreadedServer fires off a thread to run the client but does not guarantee the lifetime. As a result we delete the client immediately in the middle of TServerFramework::serve() because it is the last reference to it. I'm going to first see about adding an assertion to verify that the implementation actually bumped the reference count on the smart pointer for safety. After that, though, I'll have to change TThreadedServer so that it passes the client into a routine that runs the client and maintains the lifetime of the client. I'd like to get rid of TThreadedServer however... perhaps we can deprecate it in 0.10.0? Using a TThreadPoolServer is a better choice. was (Author: jking3): The root cause is a lack of synchronization between serve() and stop() in TServerFramework, along with an incorrect ordering of operations in disposeConnectedClient. I have fixed both of these and I am testing it. It was possible for serve() to be partially complete with a loop while stop() had been called. It was also possible for serve() to finish while a client was in newlyConnectedClient as we performed onClientConnected before we accounted for the connection. As I started testing this fix however I realized that if TThreadedServer doesn't hold on to the smart pointer that is created in TServerFramework::serve(), it will be destroyed immediately in the serve() loop itself, so you are correct in that TThreadedServer must hold a reference to that connection, so I am fixing that too. > TThreadedServer may crash if it is destroyed immediately after it returns > from serve() > -------------------------------------------------------------------------------------- > > Key: THRIFT-3768 > URL: https://issues.apache.org/jira/browse/THRIFT-3768 > Project: Thrift > Issue Type: Bug > Components: C++ - Library > Affects Versions: 0.9.3 > Reporter: Ted Wang > Assignee: James E. King, III > Priority: Minor > > Here's a sequence that shows the race: > Thread-1 (Users of TThreadedServer): Calls TThreadedServer::stop(), which > calls interruptChildren and initiates the tearing down of client connections. > Thread-2: In TServerFramework::serve(), broke out of accept, and now blocks > in TThreadedServer::serve() waiting to drain all the clients. > Thread-3 (The connected client thread created by TThreadedServer): In > disposeConnectedClient, running because the server is shutting down and the > shared_ptr specified this function to be the cleanup function for the client. > This thread just returned from onClientDisconnected and now context switches. > Thread-2: TThreadedServer::serve() is notified that all of the clients have > disconnected and completes. > Thread-1: Joins on Thread-2 and destroys the server object because it is done. > Thread-3: Finally gets a chance to run, but now encounters undefined behavior > because it is still executing a member function of an object that has been > destroyed. > You can force this race in action if you put sleep(1) before > onClientDisconnected() in disposeConnectedClient -- This message was sent by Atlassian JIRA (v6.3.4#6332)