ZhaoYang created CASSANDRA-14930:
------------------------------------

             Summary: decommission may cause timeout because messaging backlog is cleared
                 Key: CASSANDRA-14930
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14930
             Project: Cassandra
          Issue Type: Bug
          Components: Coordination, Core
            Reporter: ZhaoYang
            Assignee: ZhaoYang
             Fix For: 3.0.x, 3.11.x

On a 3-node cluster with RF=2, decommissioning a node may cause quorum write timeouts, because the messaging backlog to the decommissioned node is cleared via {{Gossiper#removeEndpoint() -> OutboundTcpConnection#closeSocket()}}. With RF=2 a quorum is 2, so every replica must respond. (A timeout is less likely with RF=3, because a quorum write can afford one missing response.)

{code:java}
What happened:
1. [WriteStage] before the leaving node is removed from tokenmetadata, the write endpoints are generated (the leaving endpoint is included)
2. [GossipStage] the leaving node is removed from tokenmetadata; no future write handler will include the leaving endpoint
3. [WriteStage] the write handler sends messages to the messaging-service backlog
4. [GossipStage] the messaging-service backlog is cleared; the messages are not sent and the connection is closed
5. [WriteStage] the write times out
{code}

| patch |
| [3.0|https://github.com/jasonstack/cassandra/commits/decommission_timeout_3.0] |
| [3.11|https://github.com/jasonstack/cassandra/commits/decommission_timeout_3.11] |

We can avoid this by delaying the destruction of the messaging connection, so that queued messages are sent and their responses are received.

The new messaging framework rewrite in {{Trunk}} avoids the issue by not clearing the messaging backlog.
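
The sequence above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Cassandra code: {{ToyConnection}}, {{closeAndClearBacklog()}}, {{drainThenClose()}} and {{writeSucceeds()}} are hypothetical names invented for illustration, the leaving node is assumed to be one of the RF write endpoints, pending ranges are ignored, and every delivered message is assumed to be acknowledged. It shows why clearing the backlog fails a quorum write at RF=2 but not at RF=3, and why draining the backlog before closing avoids the timeout.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of the race described in this ticket; not Cassandra code. */
public class DecommissionBacklogSketch {

    /** Replica responses required for a QUORUM write: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** Hypothetical stand-in for an outbound connection with a message backlog. */
    static final class ToyConnection {
        private final Queue<String> backlog = new ArrayDeque<>();
        private int sent = 0;

        void enqueue(String message) {
            backlog.add(message);
        }

        /** Models the reported behaviour: queued messages are dropped on close. */
        void closeAndClearBacklog() {
            backlog.clear();
        }

        /** Models the proposed mitigation: flush queued messages, then close. */
        void drainThenClose() {
            while (!backlog.isEmpty()) {
                backlog.poll();
                sent++;
            }
        }

        int sent() {
            return sent;
        }
    }

    /**
     * Returns true if a QUORUM write gathers enough acknowledgements.
     * Assumption: all replicas other than the leaving node respond in time.
     */
    static boolean writeSucceeds(int replicationFactor, boolean clearBacklogOnClose) {
        int acks = replicationFactor - 1;       // healthy replicas
        ToyConnection toLeavingNode = new ToyConnection();
        toLeavingNode.enqueue("MUTATION");      // step 3: message queued
        if (clearBacklogOnClose) {
            toLeavingNode.closeAndClearBacklog(); // step 4: backlog cleared
        } else {
            toLeavingNode.drainThenClose();       // mitigation: send first
        }
        acks += toLeavingNode.sent();           // assume delivered == acked
        return acks >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        System.out.println("RF=2, backlog cleared: " + writeSucceeds(2, true));
        System.out.println("RF=2, backlog drained: " + writeSucceeds(2, false));
        System.out.println("RF=3, backlog cleared: " + writeSucceeds(3, true));
    }
}
```

Under these assumptions the RF=2 write fails only when the backlog is cleared (1 acknowledgement against a quorum of 2), while the RF=3 write still reaches its quorum of 2 without the leaving node.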