[
https://issues.apache.org/jira/browse/BOOKKEEPER-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flavio Junqueira reassigned BOOKKEEPER-56:
------------------------------------------
Assignee: Sijie Guo (was: Gavin Li)
> Race condition of message handler in connection recovery in Hedwig client
> -------------------------------------------------------------------------
>
> Key: BOOKKEEPER-56
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-56
> Project: Bookkeeper
> Issue Type: Bug
> Components: hedwig-client
> Affects Versions: 4.0.0
> Reporter: Gavin Li
> Assignee: Sijie Guo
> Fix For: 4.1.0
>
> Attachments: patch_56
>
>
> There is a race condition in the connection recovery logic of the Hedwig
> client: the message handler set by the user can be incorrectly overwritten.
> When handling a channelDisconnected event, we try to reconnect to the Hedwig
> server. After the new connection is created and subscribed, we call
> StartDelivery() to restore the message handler of the disconnected
> connection. But if the user calls StartDelivery() to set a new message
> handler during this process, the new handler is overwritten with the
> original one.
> The process can be demonstrated as below:
> || main thread || netty worker thread ||
> | StartDelivery(messageHandlerA) | |
> | (connection broken here, and recovered later...) | |
> | | ResponseHandler::channelDisconnected() (connection disconnected event received) |
> | | new SubscribeReconnectCallback(subHandler.getMessageHandler()) (store messageHandlerA in SubscribeReconnectCallback to recover later) |
> | | client.doConnect() (try reconnect) |
> | | doSubUnsub() (resubscribe) |
> | | SubscriberResponseHandler::handleSubscribeResponse() (subscription succeeds) |
> | StartDelivery(messageHandlerB) | |
> | | SubscribeReconnectCallback::operationFinished() |
> | | StartDelivery(messageHandlerA) (messageHandler gets overwritten) |
> I can reliably reproduce this race condition by adding a sleep in
> ResponseHandler.
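
For anyone who wants to see the interleaving without a live Hedwig server, here is a minimal sketch that replays it with plain Java threads. It is not Hedwig code: the Connection class, the string handlers, and the replay() method are hypothetical stand-ins, and two latches play the role of the sleep so the ordering is deterministic.

```java
import java.util.concurrent.CountDownLatch;

public class HandlerRaceDemo {
    /** Hypothetical stand-in for per-connection state that holds the handler. */
    static final class Connection {
        private volatile String messageHandler;
        void startDelivery(String handler) { messageHandler = handler; }
        String getMessageHandler() { return messageHandler; }
    }

    /** Replays the interleaving and returns the handler left on the new connection. */
    static String replay() {
        Connection oldConn = new Connection();
        oldConn.startDelivery("messageHandlerA"); // main thread, before the disconnect

        Connection newConn = new Connection();
        CountDownLatch resubscribed = new CountDownLatch(1);
        CountDownLatch userCalled = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            // channelDisconnected(): capture the handler of the broken connection.
            String saved = oldConn.getMessageHandler();
            // Reconnect and resubscribe are assumed to succeed at this point.
            resubscribed.countDown();
            try {
                userCalled.await(); // plays the role of the sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // operationFinished(): restore the stale handler.
            newConn.startDelivery(saved);
        });
        worker.start();

        try {
            resubscribed.await();
            newConn.startDelivery("messageHandlerB"); // user sets a new handler
            userCalled.countDown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return newConn.getMessageHandler();
    }

    public static void main(String[] args) {
        // Prints "handler after recovery: messageHandlerA"
        System.out.println("handler after recovery: " + replay());
    }
}
```

The user's messageHandlerB is lost because the worker thread writes the saved value last.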
> Essentially, I think we should not store the messageHandler in
> ResponseHandler, because that binds the message handler to a single
> connection. Instead, no matter which connection is in use, we should use the
> same messageHandler: the one the user set most recently. So I think we should
> store the messageHandler in HedwigSubscriber instead. That way we don't need
> to restore the handler during connection recovery and thus avoid this race
> condition.
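
A rough sketch of the proposed direction, under simplifying assumptions: one subscription only (a real client would presumably key handlers by topic and subscriber ID), and the Subscriber, Connection, and MessageHandler types below are hypothetical stand-ins rather than the actual Hedwig classes or the contents of patch_56. The handler lives on the subscriber, and each connection looks it up at delivery time, so recovery has nothing to copy back.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SubscriberLevelHandlerSketch {
    /** Hypothetical stand-in for the client's message handler interface. */
    interface MessageHandler { String deliver(String msg); }

    /** Hypothetical subscriber that owns the handler across connections. */
    static final class Subscriber {
        private final AtomicReference<MessageHandler> handler = new AtomicReference<>();
        void startDelivery(MessageHandler h) { handler.set(h); }
        MessageHandler currentHandler() { return handler.get(); }
    }

    /** Hypothetical per-connection object: it holds no handler of its own. */
    static final class Connection {
        private final Subscriber owner;
        Connection(Subscriber owner) { this.owner = owner; }
        String onMessage(String msg) {
            // Look up the handler on every delivery instead of caching it.
            MessageHandler h = owner.currentHandler();
            return h == null ? null : h.deliver(msg);
        }
    }

    static String demo() {
        Subscriber sub = new Subscriber();
        sub.startDelivery(m -> "A:" + m);
        Connection first = new Connection(sub);
        first.onMessage("m1"); // delivered to handler A

        // The connection breaks. The user may call startDelivery at any point
        // during recovery; recovery itself never touches the handler.
        sub.startDelivery(m -> "B:" + m);
        Connection recovered = new Connection(sub);
        return recovered.onMessage("m2");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "B:m2"
    }
}
```

Because the reconnect path no longer writes the handler, the only writer is the user's own call, and the last value the user set is the one in effect.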