[ 
https://issues.apache.org/jira/browse/THRIFT-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224493#comment-15224493
 ] 

James E. King, III edited comment on THRIFT-3768 at 4/4/16 6:28 PM:
--------------------------------------------------------------------

The root cause is a lack of synchronization between serve() and stop() in 
TServerFramework, along with an incorrect ordering of operations in 
disposeConnectedClient.  I have fixed both of these and I am testing it.

It was possible for serve() to be partially complete with a loop while stop() 
had been called.
It was also possible for serve() to finish while a client was in 
newlyConnectedClient as we performed onClientConnected before we accounted for 
the connection.

As I started testing this fix however I realized that if TThreadedServer 
doesn't hold on to the smart pointer that is created in 
TServerFramework::serve(), it will be destroyed immediately in the serve() loop 
itself, so you are correct in that TThreadedServer must hold a reference to 
that connection, so I am fixing that too.


was (Author: jking3):
The root cause is a lack of synchronization between serve() and stop() in 
TServerFramework, along with an incorrect ordering of operations in 
disposeConnectedClient.  I have fixed both of these and I am testing it.

It was possible for serve() to be partially complete with a loop while stop() 
had been called.
It was also possible for serve() to finish while a client was in 
newlyConnectedClient as we performed onClientConnected before we accounted for 
the connection.

> TThreadedServer may crash if it is destroyed immediately after it returns 
> from serve()
> --------------------------------------------------------------------------------------
>
>                 Key: THRIFT-3768
>                 URL: https://issues.apache.org/jira/browse/THRIFT-3768
>             Project: Thrift
>          Issue Type: Bug
>          Components: C++ - Library
>    Affects Versions: 0.9.3
>            Reporter: Ted Wang
>            Assignee: James E. King, III
>            Priority: Minor
>
> Here's a sequence that shows the race:
> Thread-1 (Users of TThreadedServer): Calls TThreadedServer::stop(), which 
> calls interruptChildren and initiates the tearing down of client connections.
> Thread-2: In TServerFramework::serve(), broke out of accept, and now blocks 
> in TThreadedServer::serve() waiting to drain all the clients.
> Thread-3 (The connected client thread created by TThreadedServer): In 
> disposeConnectedClient, running because the server is shutting down and the 
> shared_ptr specified this function to be the cleanup function for the client. 
> This thread just returned from onClientDisconnected and now context switches.
> Thread-2: TThreadedServer::serve() is notified that all of the clients have 
> disconnected and completes.
> Thread-1: Joins on Thread-2 and destroys the server object because it is done.
> Thread-3: Finally gets a chance to run, but now encounters undefined behavior 
> because it is still executing a member function of an object that has been 
> destroyed.
> You can force this race in action if you put sleep(1) before 
> onClientDisconnected() in disposeConnectedClient



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to