When you say you configured them to talk to .0.31 as a seed, did you do
that by changing the yaml?

Was .0.9 ever a seed before?

I expect that if you start .0.7 and .0.9 at the same time, everything
works. Still, this looks like a logic/state bug that needs to be fixed.

(When upgrading, you usually start with all 3 hosts up and restart them
one at a time. Starting with 0 hosts online is likely poorly tested, and
we should fix that.)
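
Before starting each node, it may help to confirm which peers are actually
accepting connections on the storage port. The following is only a rough
sketch: the host list and port 7000 are taken from the quoted message, and
the function names are made up for illustration. It checks plain TCP
reachability only; it says nothing about whether gossip itself will succeed.

```python
import socket

# Hosts and storage port taken from the quoted message; adjust for your cluster.
PEERS = ["192.168.0.31", "192.168.0.7", "192.168.0.9"]
STORAGE_PORT = 7000


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


def report(peers, port, timeout=2.0):
    """Map each peer to True/False depending on whether its port accepts connections."""
    return {host: is_listening(host, port, timeout) for host in peers}


if __name__ == "__main__":
    for host, up in report(PEERS, STORAGE_PORT).items():
        state = "listening" if up else "not reachable"
        print(f"{host}:{STORAGE_PORT} {state}")
```

A peer reported as "not reachable" here would correspond to the "Connection
refused" entry in the quoted log, which is expected for a node that has not
been started yet.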



On Wed, Nov 9, 2022 at 7:08 AM Klein, Benjamin E (PERATON) <
benjamin.e.kl...@peraton.com> wrote:

> I am trying to upgrade a three-node Cassandra cluster (192.168.0.31,
> 192.168.0.7, and 192.168.0.9) from 3.11 to 4.0.3. At the start of the
> process, all three nodes are down. I have configured all three nodes to
> have 192.168.0.31:7000 as their only seed.
>
> I am trying to bring all three nodes up, one at a time. Starting Node 1
> (.31) works just fine. However, Node 2 (.7) fails to start with the error
> message "Unable to gossip with any peers". The configuration file and log
> from Node 2 are attached (the log has had lines related to loading
> individual tables snipped); the relevant portion of the log is at the
> bottom of this message. Note that this node was able to successfully
> connect to the other seed node.
>
> I have already tried the following unsuccessfully:
>
> * Starting with a completely blank (i.e., newly formatted) /data drive on
> all nodes. This worked fine the first time the cluster started; however,
> attempting to restart the cluster gives the same error.
> * Ensuring that all clocks are synchronized to the same NTP servers, which
> have a ping time to all three nodes of approximately 0.5-1.0ms
> * Setting the cross_node_timeout configuration entry to false
> * Setting the internode_tcp_connect_timeout_in_ms configuration entry to
> 20000
> * Adding an entry for each node in its /etc/hosts file (e.g., Node 1 gets
> the entry "192.168.0.31 node-1")
>
> Is there anything else I should try?
>
> ---
> Relevant portion of Cassandra log:
> INFO  [main] 2022-11-04 16:57:02,541 StorageService.java:755 - Loading
> persisted ring state
> INFO  [main] 2022-11-04 16:57:02,541 StorageService.java:838 - Populating
> token metadata from system tables
> INFO  [GossipStage:1] 2022-11-04 16:57:02,570 Gossiper.java:1969 - Adding /
> 192.168.0.31:7000 as there was no previous epState; new state is
> EndpointState: HeartBeatState = HeartBeat: generation = 0, version = -1,
> AppStateMap = {}
> INFO  [GossipStage:1] 2022-11-04 16:57:02,570 Gossiper.java:1969 - Adding /
> 192.168.0.9:7000 as there was no previous epState; new state is
> EndpointState: HeartBeatState = HeartBeat: generation = 0, version = -1,
> AppStateMap = {}
> INFO  [main] 2022-11-04 16:57:02,705 InboundConnectionInitiator.java:127 -
> Listening on address: (/192.168.0.7:7000), nic: eth0, encryption:
> unencrypted
> INFO  [Messaging-EventLoop-3-3] 2022-11-04 16:57:02,993
> OutboundConnection.java:1150 - /192.168.0.7:7000(/192.168.0.7:55882
> )->/192.168.0.31:7000-URGENT_MESSAGES-ef0bde62 successfully connected,
> version = 12, framing = CRC, encryption = unencrypted
> INFO  [Messaging-EventLoop-3-6] 2022-11-04 16:57:07,938
> NoSpamLogger.java:92 - 
> /192.168.0.7:7000->/192.168.0.9:7000-URGENT_MESSAGES-[no-channel]
> failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException:
> finishConnect(..) failed: Connection refused: /192.168.0.9:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection
> refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:251)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:673)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:650)
> at
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:530)
> at
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Exception (java.lang.RuntimeException) encountered during startup: Unable
> to gossip with any peers
> java.lang.RuntimeException: Unable to gossip with any peers
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
> at
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
> at
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
> ERROR [main] 2022-11-04 16:58:03,943 CassandraDaemon.java:911 - Exception
> encountered during startup
> java.lang.RuntimeException: Unable to gossip with any peers
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1844)
> at
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:650)
> at
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:936)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:786)
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:731)
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
> INFO  [StorageServiceShutdownHook] 2022-11-04 16:58:03,953
> HintsService.java:222 - Paused hints dispatch
> WARN  [StorageServiceShutdownHook] 2022-11-04 16:58:03,954
> Gossiper.java:2032 - No local state, state is in silent shutdown, or node
> hasn't joined, not announcing shutdown
>
