[
https://issues.apache.org/jira/browse/GEODE-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418943#comment-17418943
]
Xiaojian Zhou commented on GEODE-9522:
--------------------------------------
A detailed description of the issue and fix is here:
When the membership receives a request to remove itself from the locator
(this can be triggered by playDead), it calls GMSMembership.forceDisconnect()
to close its DM and then its cache.
However, uncleanShutdownDS(), which runs in the DisconnectThread, can take
several seconds (6-10) to close all the connections before it sets
shutdownCause in the DM (which is what triggers a cancel exception in c/s
requests or distribution). Because it runs in a separate thread, the
membership treats forceDisconnect() as done while connections are still
being closed.
During this time window, a client can re-initiate a ServerConnection, which
can put events into the cache and the HARegionQueue. Such ServerConnections
must be prevented in this window because they cause data mismatch.
The fix is to set shutdownCause in the DM as early as possible (before
closing connections). This rejects incoming AcceptorImpl socket requests
by triggering a cancel exception.
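A minimal sketch of that ordering in plain Java, for illustration only. The
classes Dm and Acceptor and all method names below are hypothetical stand-ins,
not Geode's actual DM, AcceptorImpl, or GMSMembership code, and
IllegalStateException stands in for the cancel exception. It only shows that
recording the shutdown cause before the slow close starts makes new client
connections fail while the close is still running.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class EarlyShutdownCauseSketch {

  /** Hypothetical stand-in for the DM; it only tracks the shutdown cause. */
  public static class Dm {
    private final AtomicReference<Throwable> shutdownCause = new AtomicReference<>();

    public void setShutdownCause(Throwable cause) {
      shutdownCause.compareAndSet(null, cause); // the first recorded cause wins
    }

    /** Throws once a shutdown cause is recorded (stand-in for a cancel exception). */
    public void checkCancelInProgress() {
      Throwable cause = shutdownCause.get();
      if (cause != null) {
        throw new IllegalStateException("member is shutting down", cause);
      }
    }
  }

  /** Hypothetical stand-in for the acceptor that admits client connections. */
  public static class Acceptor {
    private final Dm dm;

    public Acceptor(Dm dm) {
      this.dm = dm;
    }

    /** Returns true if a new client connection would be admitted. */
    public boolean acceptClient() {
      try {
        dm.checkCancelInProgress();
        return true;
      } catch (IllegalStateException e) {
        return false;
      }
    }
  }

  /** Records the cause first, then runs the slow close in a separate thread. */
  public static Thread forceDisconnect(Dm dm, Throwable reason, Runnable closeConnections) {
    dm.setShutdownCause(reason); // set before any connection is closed
    Thread t = new Thread(closeConnections, "DisconnectThread");
    t.start();
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    Dm dm = new Dm();
    Acceptor acceptor = new Acceptor(dm);
    CountDownLatch closeMayFinish = new CountDownLatch(1);

    System.out.println("before disconnect: " + acceptor.acceptClient());

    // The close blocks on a latch to imitate the multi-second window.
    Thread t = forceDisconnect(dm, new RuntimeException("force disconnected"), () -> {
      try {
        closeMayFinish.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    System.out.println("while closing: " + acceptor.acceptClient());
    closeMayFinish.countDown();
    t.join();
  }
}
```

In this sketch the acceptor admits the client before the disconnect and
refuses it while the close is still in progress. If the cause were set only
at the end of the close, the second check would still admit the client, which
is the window described above.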
> When a server is force disconnected, it should set shutdown cause for dm to
> prevent clients recreating server connection.
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-9522
> URL: https://issues.apache.org/jira/browse/GEODE-9522
> Project: Geode
> Issue Type: Bug
> Reporter: Xiaojian Zhou
> Priority: Major
> Labels: pull-request-available
>
> When a client is doing puts (mainly creates) to servers with a replicated
> region and some servers are shut down to force a switch of the primary
> HARegionQueue, sometimes an event with a later event id is distributed by
> the previous primary HARegionQueue, which causes events with earlier event
> ids to be rejected by clients.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)