[ https://issues.apache.org/jira/browse/BOOKKEEPER-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Junqueira reassigned BOOKKEEPER-56:
------------------------------------------

    Assignee: Sijie Guo  (was: Gavin Li)
    
> Race condition of message handler in connection recovery in Hedwig client
> -------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-56
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-56
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: hedwig-client
>    Affects Versions: 4.0.0
>            Reporter: Gavin Li
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: patch_56
>
>
> There is a race condition in the connection recovery logic of the Hedwig
> client: the message handler set by the user can be incorrectly overwritten.
> When handling a channelDisconnected event, the client tries to reconnect to
> the Hedwig server. Once the connection is re-established and resubscribed, it
> calls StartDelivery() to restore the message handler that was registered on
> the disconnected connection. If the user calls StartDelivery() with a new
> message handler while this recovery is in progress, the new handler is
> overwritten by the original one.
> The following table shows the interleaving:
> || main thread || netty worker thread ||
> | StartDelivery(messageHandlerA) | |
> | (connection breaks here and is recovered later...) | |
> | | ResponseHandler::channelDisconnected() (connection disconnected event received) |
> | | new SubscribeReconnectCallback(subHandler.getMessageHandler()) (stores messageHandlerA in SubscribeReconnectCallback to restore later) |
> | | client.doConnect() (tries to reconnect) |
> | | doSubUnsub() (resubscribes) |
> | | SubscriberResponseHandler::handleSubscribeResponse() (subscription succeeds) |
> | StartDelivery(messageHandlerB) | |
> | | SubscribeReconnectCallback::operationFinished() |
> | | StartDelivery(messageHandlerA) (message handler gets overwritten) |
> I can reliably reproduce this by adding a sleep in ResponseHandler to force
> the interleaving above.
> Fundamentally, I think we should not store the messageHandler in
> ResponseHandler, because a ResponseHandler is bound to a single connection
> while the message handler should not be. Whichever connection is in use, we
> should use the same messageHandler: the one the user set most recently. I
> propose storing the messageHandler in HedwigSubscriber instead. That way the
> handler does not need to be restored during connection recovery, and the
> race condition cannot occur.
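
The interleaving in the table and the proposed fix can be sketched in plain
Java. This is only an illustration under stated assumptions, not the Hedwig
code or the attached patch: HandlerRaceSketch, PerConnectionSubscriber,
SubscriberOwnedHandler and the simplified MessageHandler interface are all
hypothetical stand-ins, and the two threads are replayed sequentially on one
thread so that the ordering is deterministic.

```java
import java.util.concurrent.atomic.AtomicReference;

public class HandlerRaceSketch {
    /** Hypothetical stand-in for the user's message handler callback. */
    interface MessageHandler {
        String name();
    }

    /** Buggy shape: recovery snapshots the handler and writes it back later. */
    static final class PerConnectionSubscriber {
        private volatile MessageHandler handler;

        void startDelivery(MessageHandler h) {
            handler = h;
        }

        // Plays the role of channelDisconnected(): capture the current handler
        // and return the step that runs once resubscription has finished.
        Runnable onDisconnected() {
            final MessageHandler snapshot = handler;
            return () -> startDelivery(snapshot);
        }

        MessageHandler current() {
            return handler;
        }
    }

    /** Proposed shape: the subscriber owns the handler; recovery never writes it. */
    static final class SubscriberOwnedHandler {
        private final AtomicReference<MessageHandler> handler = new AtomicReference<>();

        void startDelivery(MessageHandler h) {
            handler.set(h);
        }

        // Recovery has nothing to restore: whichever connection delivers a
        // message simply reads the subscriber's current handler.
        Runnable onDisconnected() {
            return () -> { };
        }

        MessageHandler current() {
            return handler.get();
        }
    }

    /** Replays the table's ordering against the buggy shape. */
    static String buggyInterleaving() {
        PerConnectionSubscriber s = new PerConnectionSubscriber();
        s.startDelivery(() -> "A");                  // main thread
        Runnable reconnectDone = s.onDisconnected(); // worker: disconnect seen
        s.startDelivery(() -> "B");                  // main thread, mid-recovery
        reconnectDone.run();                         // worker: recovery finishes
        return s.current().name();
    }

    /** Replays the same ordering against the proposed shape. */
    static String fixedInterleaving() {
        SubscriberOwnedHandler s = new SubscriberOwnedHandler();
        s.startDelivery(() -> "A");
        Runnable reconnectDone = s.onDisconnected();
        s.startDelivery(() -> "B");
        reconnectDone.run();
        return s.current().name();
    }

    public static void main(String[] args) {
        System.out.println("per-connection: " + buggyInterleaving());
        System.out.println("subscriber-owned: " + fixedInterleaving());
    }
}
```

Run as written, the sketch prints "per-connection: A" followed by
"subscriber-owned: B": in the first shape the handler set during recovery is
lost, while in the second it survives because recovery only reads the handler.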

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira