Hi All,

Apologies for the long email, so TL;DR:
A node bootstrapped successfully but received only the data for which it
is the primary owner, and none of the data it should hold as a replica.

I'm experiencing a really odd situation during node bootstrap.

Cassandra 3.11.4.
Background:
Due to a capacity issue on one of our clusters, we decided to increase the
storage space on each node. Since the nodes are set with RAID 0, the only
valid option was to decommission the node, add the disks and rebuild, and
once done, bootstrap it back to the cluster.

Everything was going well until one of our system guys accidentally handled
the disks on an active server instead of a decommissioned one. The node
immediately crashed since the data directory was destroyed.
I had to forcibly remove it from the cluster, but when I added it back,
it received only about 1/3 of the data it should have received.
I tried several times, even after a rolling restart of the whole
cluster, and always got the same result. I tried repairing the node
with -pr, but that repaired practically nothing. I then ran a full
repair (without -pr), and that did fix the problem.
So I concluded that the node received only the data for its primary
range, and not the data it should hold as a replica for other nodes'
ranges. I got the same outcome on both the node that crashed and the
node that was gracefully decommissioned.
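
To illustrate the distinction between primary and replica ranges, here
is a minimal sketch. It assumes a replication factor of 3,
SimpleStrategy and one token per node; none of these is confirmed for
the cluster above, and the node count and function names are
hypothetical. Under those assumptions a node's primary range is one of
the three ranges it stores, which would be consistent with the 1/3
figure if the ranges hold similar amounts of data, and with repair -pr
(which repairs only the primary range) finding nothing to fix.

```python
# Sketch only: assumed RF = 3, SimpleStrategy, one token per node.
RF = 3          # assumed replication factor (not stated in the email)
NODE_COUNT = 6  # hypothetical ring size


def stored_ranges(node_index, node_count=NODE_COUNT, rf=RF):
    """Return the indices of the primary ranges a node stores.

    Range i is the one whose primary owner is node i. Its replicas are
    node i plus the next rf - 1 nodes clockwise, so node i stores its
    own range and the rf - 1 ranges that precede it on the ring.
    """
    return [(node_index - k) % node_count for k in range(rf)]


def primary_fraction(node_index, node_count=NODE_COUNT, rf=RF):
    """Fraction of a node's stored ranges for which it is the primary."""
    ranges = stored_ranges(node_index, node_count, rf)
    primary = [r for r in ranges if r == node_index]
    return len(primary) / len(ranges)


if __name__ == "__main__":
    print("ranges stored by node 0:", stored_ranges(0))
    print("fraction that is primary:", primary_fraction(0))
```

With vnodes or NetworkTopologyStrategy the layout of ranges differs,
but each node is still the primary owner for only part of what it
stores.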

Any idea why it's acting like this?

Thanks!
