ANSHUL SAINI created CASSANDRA-18771:
----------------------------------------

             Summary: Cassandra 4.0.5 nodes fails to start when replacing dead node
                 Key: CASSANDRA-18771
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18771
             Project: Cassandra
          Issue Type: Bug
          Components: Cluster/Gossip
            Reporter: ANSHUL SAINI


When replacing a down node using the *replace_address* property, the new node fails to start. The messages below appear continuously in the system logs.

{noformat}
WARN [Messaging-EventLoop-3-2] 2023-08-16 14:18:58,565 NoSpamLogger.java:95 - /xxx.xxx.xxx.xxx:7000->/yyy.yyy.yyy.yyy:7000-URGENT_MESSAGES-[no-channel] dropping message of type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
INFO [Messaging-EventLoop-3-2] 2023-08-16 14:19:23,910 NoSpamLogger.java:92 - /xxx.xxx.xxx.xxx->/yyy.yyy.yyy.yyy:7000-URGENT_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out: /xxx.xxx.xxx.xxx:7000
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:576)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)
{noformat}

xxx.xxx.xxx.xxx - IP of down node
yyy.yyy.yyy.yyy - IP of new node

No other ERROR/WARNING appears in the logs. The node goes into UJ state, but never joins the ring.
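
To check whether another node shows the same symptom, the two messages quoted above can be counted in its system log. The following is only a rough sketch: the function name `count_symptoms` and the pattern labels are made up for illustration, and the regular expressions are taken from the log excerpt in this report rather than from any Cassandra tooling.

```python
import re
from collections import Counter

# Patterns copied from the two log messages quoted in this report.
# The dictionary keys are illustrative labels, not Cassandra identifiers.
PATTERNS = {
    "gossip_syn_dropped": re.compile(
        r"dropping message of type GOSSIP_DIGEST_SYN whose timeout expired"
    ),
    "connect_timeout": re.compile(
        r"ConnectTimeoutException: connection timed out"
    ),
}


def count_symptoms(lines):
    """Return a Counter of how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It could be used as `count_symptoms(open(path, errors="replace"))`, where `path` points at the node's system log (the location depends on the installation). Steadily growing counts while the node stays in UJ would match the behaviour described here.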
This does not always happen, but we have seen it more often since upgrading from 3.11.9 to 4.0.5. The configuration appears to be fine, since we mitigate this by terminating the node and spawning a new one with the same configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)