[
https://issues.apache.org/jira/browse/THRIFT-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224493#comment-15224493
]
James E. King, III edited comment on THRIFT-3768 at 4/4/16 6:28 PM:
--------------------------------------------------------------------
The root cause is a lack of synchronization between serve() and stop() in
TServerFramework, along with an incorrect ordering of operations in
disposeConnectedClient. I have fixed both of these and I am testing it.
It was possible for serve() to be partially complete with a loop while stop()
had been called.
It was also possible for serve() to finish while a client was in
newlyConnectedClient as we performed onClientConnected before we accounted for
the connection.
As I started testing this fix however I realized that if TThreadedServer
doesn't hold on to the smart pointer that is created in
TServerFramework::serve(), it will be destroyed immediately in the serve() loop
itself, so you are correct in that TThreadedServer must hold a reference to
that connection, so I am fixing that too.
was (Author: jking3):
The root cause is a lack of synchronization between serve() and stop() in
TServerFramework, along with an incorrect ordering of operations in
disposeConnectedClient. I have fixed both of these and I am testing it.
It was possible for serve() to be partially complete with a loop while stop()
had been called.
It was also possible for serve() to finish while a client was in
newlyConnectedClient as we performed onClientConnected before we accounted for
the connection.
> TThreadedServer may crash if it is destroyed immediately after it returns
> from serve()
> --------------------------------------------------------------------------------------
>
> Key: THRIFT-3768
> URL: https://issues.apache.org/jira/browse/THRIFT-3768
> Project: Thrift
> Issue Type: Bug
> Components: C++ - Library
> Affects Versions: 0.9.3
> Reporter: Ted Wang
> Assignee: James E. King, III
> Priority: Minor
>
> Here's a sequence that shows the race:
> Thread-1 (Users of TThreadedServer): Calls TThreadedServer::stop(), which
> calls interruptChildren and initiates the tearing down of client connections.
> Thread-2: In TServerFramework::serve(), broke out of accept, and now blocks
> in TThreadedServer::serve() waiting to drain all the clients.
> Thread-3 (The connected client thread created by TThreadedServer): In
> disposeConnectedClient, running because the server is shutting down and the
> shared_ptr specified this function to be the cleanup function for the client.
> This thread just returned from onClientDisconnected and now context switches.
> Thread-2: TThreadedServer::serve() is notified that all of the clients have
> disconnected and completes.
> Thread-1: Joins on Thread-2 and destroys the server object because it is done.
> Thread-3: Finally gets a chance to run, but now encounters undefined behavior
> because it is still executing a member function of an object that has been
> destroyed.
> You can force this race in action if you put sleep(1) before
> onClientDisconnected() in disposeConnectedClient
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)