Re: Issue replacing a dead node

Courtney Fri, 23 May 2025 22:46:23 -0700

Some updates after getting back to this. I ran hardware tests and could not find any hardware issues. Instead of trying a replace, I went the route of removing the dead node entirely and then adding a new node.

The new node is still joining, but I am hitting some oddities in the log. When joining, the process seems to be "rescheduling" around the time that these logs appear:

INFO [Messaging-EventLoop-3-25] 2025-05-24 05:20:38,203 NoSpamLogger.java:105 - /<new-node>:7000->/<dead-node>:7000-SMALL_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out: /<dead-node>:7000
WARN [Messaging-EventLoop-3-25] 2025-05-24 05:20:46,200 NoSpamLogger.java:108 - /<new-node>:7000->/<dead-node>:7000-SMALL_MESSAGES-[no-channel] dropping message of type PING_REQ whose timeout expired before reaching the network
INFO [Messaging-EventLoop-3-26] 2025-05-24 05:32:27,606 NoSpamLogger.java:105 - /<new-node>:7000->/<dead-node>:7000-LARGE_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out: /<dead-node>:7000
INFO [GossipStage:1] 2025-05-24 05:33:15,538 Gossiper.java:1428 - InetAddress /<dead-node>:7000 is now DOWN

The cluster knows nothing about the old node, and the old node is completely powered off. I don't know why the new node is attempting to connect to a dead node that has no presence in the cluster.


Beforehand, I start to get these messages:

INFO [OptionalTasks:1] 2025-05-24 05:27:46,189 NoSpamLogger.java:105 - "Cannot read from a bootstrapping node" while executing SELECT * FROM system_auth.roles WHERE role = 'cassandra' ALLOW FILTERING
WARN [OptionalTasks:1] 2025-05-24 05:32:16,240 CassandraRoleManager.java:359 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO [OptionalTasks:1] 2025-05-24 05:32:16,240 CassandraRoleManager.java:395 - Setup task failed with error, rescheduling

I've had to stop and start Cassandra to get past this, but I am afraid I will hit this again soon.


On 5/16/25 11:54 PM, Sebastian Marsching wrote:

To add on to what Bowen already wrote, if you cannot find any reason in the 
logs at all, I would retry using different hardware.

In the recent past I have seen two cases where strange Cassandra problems were 
actually caused by broken hardware (in both cases, a faulty memory module 
caused the issues). In one case, there were log messages, but misleading ones 
about SSTable corruption (the SSTables were fine, but when loaded into memory, 
the data got corrupted). In the other case, there were no log messages at all. 
The Cassandra process simply stopped without a good reason. Eventually I found 
the crash dumps in the Cassandra data directory (whether they are written 
depends on the JVM setting), which indicated that the JVM experienced a 
segmentation fault.

So, if you have any spare hardware lying around (it’s best if it is from a 
different batch to exclude common-mode failures), using a different piece of 
hardware and trying with that one might make sense. If the problem stays, you 
can at least be sure that it isn’t related to the hardware, and if it vanishes, 
you can further inspect the original hardware.


On 17.05.2025 at 04:27, Bowen Song via user <user@cassandra.apache.org> wrote:

In my experience, a failed bootstrap / node replacement always leaves some traces 
in the logs. At the very minimum, there are going to be logs about streaming 
sessions failing or aborting. I have never seen it silently fail or stop 
without leaving any traces in the log. I can't think of anything that can cause 
the process to fail without leaving a trace in the log. BTW, the relevant 
logs can be hours before the symptom becomes visible, because a failed 
streaming session does not cause Cassandra to immediately abort other active 
streaming sessions, and the remaining active sessions can take a while to 
complete.
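
As a rough illustration of that kind of log search, here is a minimal Python sketch. It assumes log lines shaped like the excerpts quoted elsewhere in this thread (`LEVEL [thread] timestamp File.java:line - message`). The function name `find_red_flags` and the default keyword list are hypothetical and are not part of Cassandra or its tooling.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpts quoted in this thread:
#   LEVEL [thread] YYYY-MM-DD HH:MM:SS,mmm File.java:NNN - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<source>\S+)\s+-\s+(?P<msg>.*)$"
)


def find_red_flags(lines, keywords=("stream",), levels=("WARN", "ERROR")):
    """Return (timestamp, level, message) tuples for entries that are logged
    at one of `levels` or whose message mentions one of `keywords`
    (case-insensitive). Both defaults are illustrative assumptions."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # continuation lines such as stack traces are skipped
        msg = m.group("msg")
        if m.group("level") in levels or any(k in msg.lower() for k in keywords):
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            hits.append((ts, m.group("level"), msg))
    return hits
```

Because the relevant entries can be hours older than the visible symptom, it may help to run something like `find_red_flags(open(path))` over the whole file rather than only its tail, on the new node and on the nodes streaming to it. The location of `system.log` depends on the installation.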

If the process repeatedly fails at a certain place, I would suspect some sort 
of data corruption or disk error that prevents the data from being read or 
deserialised correctly. But this is just a guess, and I could be wrong.

On 16/05/2025 01:14, Courtney wrote:

I checked all the logs and really couldn't find anything. I couldn't find any 
sort of errors in dmesg, system.log, debug.log, gc.log (maybe up the log 
level?), systemd journal... the logs are totally clean. It just stops gossiping 
all of a sudden at 22GB of data each time, and then the old node returns to the 
DN state. What is `nodetool bootstrap resume` going to do? Is there a risk to 
running resume when the replacement node is no longer in the cluster? Could too 
high a tombstone ratio cause this?

On 5/15/25 5:08 PM, Bowen Song via user wrote:

The dead node being replaced going back to the DN state indicates that the new 
replacement node failed to join the cluster, usually because the streaming was 
interrupted (e.g. by network issues, or long STW GC pauses). I would start 
looking for red flags in the logs, including Cassandra's logs, GC logs, dmesg, 
systemd journal, etc., on the new node, and on other nodes in the cluster too. 
Also, I would try `nodetool bootstrap resume` on the replacement node.


On 12/05/2025 09:53, Courtney wrote:

Hello everyone,

I have a cluster with 2 datacenters. I am using GossipingPropertyFileSnitch as 
my endpoint snitch. Cassandra version 4.1.8. One datacenter is fully Ubuntu 
24.04 on OpenJDK 11 and the other is Ubuntu 20.04 on OpenJDK 8. A seed node died 
in my second DC, the one running Ubuntu 20.04 hosts. I ordered a new dedicated 
server. I updated my seeds to forget the dead seed node. I followed the steps to 
replace a dead node:

JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS -Dcassandra.replace_address_first_boot=<dead_node_ip>"

Configs between the old and new node are identical apart from the IP addresses and the line 
above in the env file to replace the dead node. I started the node and it began replacing 
the old node and was in the `UJ` state. Not long into the process, the new node stops 
processing data and the cluster forgets the new node and remembers the old one in its 
`DN` state (which is turned off, no power). There are no errors in the logs. I've tried 
several times hoping to solve the issue. I upped my ROOT logging level to DEBUG, and I also 
set "org.apache.cassandra.gms.Gossiper TRACE". No errors.

With TRACE set for the Gossiper, I notice that gossiping stops and data stops 
streaming at about the same time. I cannot run any nodetool commands on the new 
node. The process doesn't die, and it leaves open connections to nodes that are 
streaming data, but I don't see any data streaming.

I've thought through a lot. Space isn't an issue, ulimits are set high in 
/etc/security/limits.conf. Checking /proc/<pid>/limits shows the values are 
high. I've replaced nodes before like this without issue, but this one is causing me 
grief. Is there anything more I can do?
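
For reference, the `/proc/<pid>/limits` check mentioned above could be scripted roughly as follows. This is only a sketch: it assumes the usual Linux layout of that file (a header row, then columns separated by runs of spaces), and the function name `parse_limits` is hypothetical.

```python
import re


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {limit name: (soft, hard)}.

    Assumes the usual Linux layout: one header row, then one row per limit
    with columns separated by two or more spaces."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) >= 3:
            limits[cols[0]] = (cols[1], cols[2])
    return limits
```

Reading the file for the running Cassandra process and looking up an entry such as `"Max open files"` then shows the limits the process actually received, which can differ from what `/etc/security/limits.conf` specifies when the service is started by systemd.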

Courtney
