One last update:
After kicking it a bit more, the node finally joined the cluster fully. After the third reboot of the server, it eventually reached the UN
state. I wish I had kept the link, but I had read about someone with a
similar issue joining a node to a 4.1.x cluster, and the answer there was
to restart the service.
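For anyone who finds this thread later: the restart was just the usual service bounce (assuming the packaged systemd unit name), followed by watching for UN:

    sudo systemctl restart cassandra
    nodetool status    # wait for the new node to show UN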
On 5/23/25 10:46 PM, Courtney wrote:
Some updates after getting back to this. I did hardware tests and
could not find any hardware issues. Instead of trying a replace, I
went the route of removing the dead node entirely and then adding in a
new node.
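For reference, the removal itself was roughly the standard nodetool sequence; the host ID below is a placeholder for whatever `nodetool status` reports for the DN entry, nothing specific to my cluster:

    nodetool status                    # note the Host ID shown for the DN node
    nodetool removenode <host-id>      # remove it and re-replicate its ranges
    nodetool removenode status         # check progress
    # nodetool assassinate <dead-node-ip>   # last resort only, skips re-replication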
The new node is still joining, but I am hitting some oddities in the
log. While it joins, the process seems to get stuck "rescheduling" around
the time these log entries appear:
INFO [Messaging-EventLoop-3-25] 2025-05-24 05:20:38,203 NoSpamLogger.java:105 - /<new-node>:7000->/<dead-node>:7000-SMALL_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out: /<dead-node>:7000
WARN [Messaging-EventLoop-3-25] 2025-05-24 05:20:46,200 NoSpamLogger.java:108 - /<new-node>:7000->/<dead-node>:7000-SMALL_MESSAGES-[no-channel] dropping message of type PING_REQ whose timeout expired before reaching the network
INFO [Messaging-EventLoop-3-26] 2025-05-24 05:32:27,606 NoSpamLogger.java:105 - /<new-node>:7000->/<dead-node>:7000-LARGE_MESSAGES-[no-channel] failed to connect
io.netty.channel.ConnectTimeoutException: connection timed out: /<dead-node>:7000
INFO [GossipStage:1] 2025-05-24 05:33:15,538 Gossiper.java:1428 - InetAddress /<dead-node>:7000 is now DOWN
The cluster knows nothing about the old node, and the old node is completely
powered off. I don't know why the new node is attempting to connect to a dead
node that has no presence in the cluster.
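For what it's worth, these are the sort of checks that statement is based on, all stock tooling (cqlsh auth flags omitted, addresses are placeholders); system.peers_v2 is the 4.x peers table:

    nodetool status                                         # no DN entry for the old node
    nodetool gossipinfo | grep -A 5 '<dead-node>'           # no gossip state for it either
    cqlsh -e "SELECT peer, host_id FROM system.peers_v2;"   # not listed in the peers table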
Before that, I start getting these messages:
INFO [OptionalTasks:1] 2025-05-24 05:27:46,189 NoSpamLogger.java:105 - "Cannot read from a bootstrapping node" while executing SELECT * FROM system_auth.roles WHERE role = 'cassandra' ALLOW FILTERING
WARN [OptionalTasks:1] 2025-05-24 05:32:16,240 CassandraRoleManager.java:359 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO [OptionalTasks:1] 2025-05-24 05:32:16,240 CassandraRoleManager.java:395 - Setup task failed with error, rescheduling
I've had to stop and start Cassandra to get past this, but I am
afraid I will hit it again soon.
On 5/16/25 11:54 PM, Sebastian Marsching wrote:
To add on to what Bowen already wrote, if you cannot find any reason
in the logs at all, I would retry using different hardware.
In the recent past I have seen two cases where strange Cassandra
problems were actually caused by broken hardware (in both cases, a
faulty memory module caused the issues). In one case, there were log
messages, but misleading ones about SSTable corruption (the SSTables
were fine, but when loaded into memory, the data got corrupted). In
the other case, there were no log messages at all. The Cassandra
process simply stopped without a good reason. Eventually I found the
crash dumps in the Cassandra data directory (whether they are written
depends on the JVM setting), which indicated that the JVM experienced
a segmentation fault.
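As an aside, it can help to pin down where those crash dumps go; something along these lines in cassandra-env.sh (the path is only an example) writes them to the log directory instead of whatever the working directory happens to be:

    JVM_OPTS="$JVM_OPTS -XX:ErrorFile=/var/log/cassandra/hs_err_pid%p.log"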
So, if you have any spare hardware lying around (it’s best if it is
from a different batch to exclude common-mode failures), using a
different piece of hardware and trying with that one might make
sense. If the problem stays, you can at least be sure that it isn’t
related to the hardware, and if it vanishes, you can further inspect
the original hardware.
On 5/17/25 4:27 AM, Bowen Song via user
<user@cassandra.apache.org> wrote:
In my experience, a failed bootstrap / node replacement always leaves
some traces in the logs. At the very minimum, there will be
logs about streaming sessions failing or aborting. I have never seen
it silently fail or stop without leaving any trace in the logs, and I
can't think of anything that could cause the process to fail without
leaving one. BTW, the relevant logs can be
hours before the symptom becomes visible, because a failed streaming
session does not cause Cassandra to immediately abort the other active
streaming sessions, and the remaining active sessions can take a
while to complete.
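A rough way to surface those traces across the whole time window is to grep the logs for the streaming classes; the paths below are the Debian/Ubuntu package defaults, adjust as needed:

    grep -iE 'stream.*(fail|abort|error)' /var/log/cassandra/system.log*
    grep -iE 'StreamSession|StreamResultFuture' /var/log/cassandra/debug.log*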
If the process repeatedly fails at the same point, I would suspect
some sort of data corruption or disk error resulting in data that
cannot be read or deserialised correctly. But this is just a guess,
and I could be wrong.
On 16/05/2025 01:14, Courtney wrote:
I checked all the logs and really couldn't find anything. I
couldn't find any sort of error in dmesg, system.log, debug.log,
gc.log (maybe I should raise the log level?), or the systemd journal... the logs are
totally clean. It just stops gossiping all of a sudden at 22 GB of
data each time, and then the old node returns to the DN state. What is
`nodetool bootstrap resume` going to do? Is there a risk in running
resume when the replacement node is no longer in the cluster? Could
too high a tombstone ratio cause this?
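For context, tombstone levels can at least be quantified with the stock tools; the keyspace, table, and data path below are placeholders:

    nodetool tablestats <keyspace>.<table> | grep -i tombstone
    sstablemetadata /var/lib/cassandra/data/<keyspace>/<table>-*/*-Data.db | grep -i droppable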
On 5/15/25 5:08 PM, Bowen Song via user wrote:
The dead node being replaced went back to DN state indicating the
new replacement node failed to join the cluster, usually because
the streaming was interrupted (e.g. by network issues, or long STW
GC pauses). I would start looking for red flags in the logs,
including Cassandra's logs, GC logs, dmesg, systemd journal, etc.,
on the new node, and other nodes in the cluster too. Also, I would
try `nodetool bootstrap resume` on the replacement node.
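If you do go that route, it is just the stock commands on the replacement node, nothing cluster-specific; netstats is a convenient way to see whether streaming actually resumes:

    nodetool bootstrap resume    # restart the interrupted streaming
    nodetool netstats -H         # watch the streaming sessions and bytes transferred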
On 12/05/2025 09:53, Courtney wrote:
Hello everyone,
I have a cluster with 2 datacenters. I am using
GossipingPropertyFileSnitch as my endpoint snitch, on Cassandra
4.1.8. One datacenter is entirely Ubuntu 24.04 with OpenJDK 11, and
the other is Ubuntu 20.04 with OpenJDK 8. A seed node died in
my second DC, the one running the Ubuntu 20.04 hosts. I ordered a new
dedicated server, updated my seeds to forget the dead seed node, and
followed the steps to replace a dead node:
JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS -Dcassandra.replace_address_first_boot=<dead_node_ip>"
Configs between the old and new nodes are identical apart from the IP
addresses and that line above in the env file to replace the dead node. I
started the node, it began replacing the old one, and it showed up in
the `UJ` state. Not long into the process, the new node stops
processing data and the cluster forgets the new node and
remembers the old one in its `DN` state (even though that machine is
powered off). There are no errors in the logs. I've tried several
times hoping to solve the issue. I upped my ROOT logging level to
DEBUG and also set "org.apache.cassandra.gms.Gossiper TRACE". No
errors.
With TRACE set for the Gossiper, I notice that gossiping stops and
data stops streaming at about the same time. I cannot run any
nodetool commands on the new node. The process doesn't die, and it
keeps open connections to the nodes that were streaming data, but I
don't see any data actually streaming.
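Since nodetool is unresponsive on the new node itself, the picture above is pieced together roughly like this from the other nodes and from the sockets on the new node; all stock tooling:

    nodetool status               # on a peer: watch the UJ entry vanish and DN come back
    nodetool netstats -H          # on the streaming source nodes: sessions sit idle
    ss -tn | grep ':7000'         # on the new node: connections stay open but quiet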
I've thought through a lot of possibilities. Space isn't an issue, and
ulimits are set high in /etc/security/limits.conf; checking
/proc/<pid>/limits shows the values are high. I've replaced nodes like
this before without issue, but this one is causing me grief. Is there
anything more I can do?
Courtney