When you say you configured them to talk to .0.31 as a seed, did you do that by changing the yaml?
Was 0.9 ever a seed before?

I expect if you start 0.7 and 0.9 at the same time, it all works. This looks like a logic/state bug that needs to be fixed, though.

(If you're going to upgrade, usually you start with all 3 hosts up, and restart one at a time. Starting with 0 online is likely poorly tested, and we should fix that.)

On Wed, Nov 9, 2022 at 7:08 AM Klein, Benjamin E (PERATON) <benjamin.e.kl...@peraton.com> wrote:

> I am trying to upgrade a three-node Cassandra cluster (192.168.0.31,
> 192.168.0.7, and 192.168.0.9) from 3.11 to 4.0.3. At the start of the
> process, all three nodes are down. I have configured all three nodes to
> have 192.168.0.31:7000 as their only seed.
>
> I am trying to bring all three nodes up, one at a time. Starting Node 1
> (.31) works just fine. However, Node 2 (.7) fails to start with the error
> message "Unable to gossip with any peers". The configuration file and log
> from Node 2 are attached (the log has had lines related to loading
> individual tables snipped); the relevant portion of the log is at the
> bottom of this message. Note that this node was able to successfully
> connect to the other seed node.
>
> I have already tried the following unsuccessfully:
>
> * Starting with a completely blank (i.e., newly formatted) /data drive on
> all nodes. This worked fine the first time the cluster started; however,
> attempting to restart the cluster gives the same error.
> * Ensuring that all clocks are synchronized to the same NTP servers, which
> have a ping time to all three nodes of approximately 0.5-1.0ms
> * Setting the cross_node_timeout configuration entry to false
> * Setting the internode_tcp_connect_timeout_in_ms configuration entry to
> 20000
> * Adding an entry for each node in its /etc/hosts file (e.g., Node 1 gets
> the entry "192.168.0.31 node-1")
>
> Is there anything else I should try?
>
> ---
> Relevant portion of Cassandra log:
> INFO [main] 2022-11-04 16:57:02,541 StorageService.java:755 - Loading
> persisted ring state
> INFO [main] 2022-11-04 16:57:02,541 StorageService.java:838 - Populating
> token metadata from system tables
> INFO [GossipStage:1] 2022-11-04 16:57:02,570 Gossiper.java:1969 - Adding /
> 192.168.0.31:7000 as there was no previous epState; new state is
> EndpointState: HeartBeatState = HeartBeat: generation = 0, version = -1,
> AppStateMap = {}
> INFO [GossipStage:1] 2022-11-04 16:57:02,570 Gossiper.java:1969 - Adding /
> 192.168.0.9:7000 as there was no previous epState; new state is
> EndpointState: HeartBeatState = HeartBeat: generation = 0, version = -1,
> AppStateMap = {}
> INFO [main] 2022-11-04 16:57:02,705 InboundConnectionInitiator.java:127 -
> Listening on address: (/192.168.0.7:7000), nic: eth0, encryption:
> unencrypted
> INFO [Messaging-EventLoop-3-3] 2022-11-04 16:57:02,993
> OutboundConnection.java:1150 - /192.168.0.7:7000(/192.168.0.7:55882
> )->/192.168.0.31:7000-URGENT_MESSAGES-ef0bde62 successfully connected,
> version = 12, framing = CRC, encryption = unencrypted
> INFO [Messaging-EventLoop-3-6] 2022-11-04 16:57:07,938
> NoSpamLogger.java:92 -
> /192.168.0.7:7000->/192.168.0.9:7000-URGENT_MESSAGES-[no-channel]
> failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException:
> finishConnect(..) failed: Connection refused: /192.168.0.9:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection
> refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:251)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:673)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:650)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:530)
> at
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Exception (java.lang.RuntimeException) encountered during startup: Unable
> to gossip with any peers
> java.lang.RuntimeException: Unable to gossip with any peers
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
> at
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
> at
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
> ERROR [main] 2022-11-04 16:58:03,943 CassandraDaemon.java:911 - Exception
> encountered during startup
> java.lang.RuntimeException: Unable to gossip with any peers
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
> at
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
> at
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
> INFO [StorageServiceShutdownHook] 2022-11-04 16:58:03,953
> HintsService.java:222 - Paused hints dispatch
> WARN [StorageServiceShutdownHook] 2022-11-04 16:58:03,954
> Gossiper.java:2032 - No local state, state is in silent shutdown, or node
> hasn't joined, not announcing shutdown
>
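
For reference, the "Connection refused" on 192.168.0.9:7000 in the log above is what you would expect while that node is still down. A minimal sketch in Python for checking which nodes are currently accepting connections on the storage port before starting the next one is below. The helper names (port_is_open, check_peers) are made up for illustration, the addresses and port 7000 are the ones quoted in the log, and the check only covers TCP reachability, not whether gossip itself succeeds.

```python
import socket

# Addresses quoted in the thread; 7000 is the internode port shown in the log.
PEERS = ["192.168.0.31", "192.168.0.7", "192.168.0.9"]
STORAGE_PORT = 7000


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_peers(peers, port=STORAGE_PORT, timeout=2.0):
    """Map each peer address to whether its port accepted a connection."""
    return {host: port_is_open(host, port, timeout) for host in peers}


if __name__ == "__main__":
    for host, ok in check_peers(PEERS).items():
        status = "listening" if ok else "not reachable"
        print(f"{host}:{STORAGE_PORT} {status}")
```

Run from Node 2 right before starting Cassandra there, this should report .31 as listening and .9 as not reachable in the scenario described, which matches what the log shows.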