What was the warning? Is it related to the disk failure policy? Could
you please share the relevant log? You can edit it and redact the
sensitive information before sharing it.
Also, I can't help but notice that you used the word "delete" (instead
of "clear") to describe the process of removing snapshots. May I ask how
you deleted the snapshots? Was it "nodetool clearsnapshot ...", "rm
-rf ..." or something else?
On 01/03/2021 11:27, Marco Gasparini wrote:
Thanks for answering, Bowen.
Actually, I checked the server log and the only warning was that a
node went offline.
No, I have no backups or snapshots.
In the meantime I found that Cassandra probably moved all the files
from a directory into the snapshot directory. I am fairly sure of that
because I recently deleted all the snapshots I had made, since the disk
was running out of space, and I then found this very directory full of
files whose modification timestamp matches the first error I got in
the log.
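To compare snapshot timestamps against the first error in the log, something like the following sketch could list every snapshot directory with its modification time. It assumes the usual on-disk layout of <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>/ and the common default data path /var/lib/cassandra/data; both should be checked against data_file_directories in cassandra.yaml. The function name list_snapshots is only illustrative.

```python
from datetime import datetime, timezone
from pathlib import Path


def list_snapshots(data_dir):
    """Return (keyspace, table_dir, snapshot_name, mtime_utc) tuples, oldest first."""
    found = []
    # Assumed layout: <data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>/
    for snap in Path(data_dir).glob("*/*/snapshots/*"):
        if not snap.is_dir():
            continue
        mtime = datetime.fromtimestamp(snap.stat().st_mtime, tz=timezone.utc)
        table_dir = snap.parent.parent
        found.append((table_dir.parent.name, table_dir.name, snap.name, mtime))
    return sorted(found, key=lambda row: row[3])


if __name__ == "__main__":
    # Assumed default path; adjust to match data_file_directories in cassandra.yaml.
    for keyspace, table_dir, name, mtime in list_snapshots("/var/lib/cassandra/data"):
        print(f"{mtime:%Y-%m-%d %H:%M:%S} UTC  {keyspace}/{table_dir}  {name}")
```

As far as I know, with auto_snapshot enabled Cassandra takes a snapshot when a table or keyspace is dropped, so a snapshot whose time lines up with the first error would be worth a closer look. Treat that as something to verify on your own cluster rather than a diagnosis.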
On Mon, 1 Mar 2021 at 12:13, Bowen Song
<bo...@bso.ng.invalid> wrote:
The first thing I'd check is the server log. The log may contain
vital information about the cause, and there may be different ways
to recover depending on what that cause is.
Also, please allow me to ask a seemingly obvious question: do you
have a backup?
On 01/03/2021 09:34, Marco Gasparini wrote:
hello everybody,
This morning, Monday!!!, I was checking on the Cassandra cluster and
I noticed that all the data was missing. I found the following
error on each node (9 nodes in the cluster):
2021-03-01 09:05:52,984 WARN [MessagingService-Incoming-/x.x.x.x] IncomingTcpConnection.java:103 run UnknownColumnFamilyException reading from socket; closing
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId cba90a70-5c46-11e9-9e36-f54fe3235e69. If a table was just created, this is likely due to the schema not being fully propagated. Please wait for schema agreement on table creation.
    at org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1533)
    at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:758)
    at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697)
    at org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50)
    at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:195)
    at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:183)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94)
I tried to query the keyspace and got this:
node1# cqlsh
Connected to Cassandra Cluster at x.x.x.x:9042.
[cqlsh 5.0.1 | Cassandra 3.11.5.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> select * from mykeyspace.mytable where id = 123935;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Keyspace mykeyspace does not exist"
Investigating on each node, I found that all the SSTables still exist,
so I think the data is still there but the keyspace has vanished,
"magically".
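For anyone wanting to confirm which on-disk directory belongs to the cfId in the warning, here is a minimal sketch. It relies on the table directory being named <table>-<id with the hyphens removed>, and it assumes the data lives under /var/lib/cassandra/data (check data_file_directories in cassandra.yaml). The helper names cfid_to_dir_suffix and find_table_dirs are made up for this example.

```python
from pathlib import Path


def cfid_to_dir_suffix(cf_id):
    """Strip hyphens from a table id; directories are named '<table>-<id without hyphens>'."""
    return cf_id.replace("-", "").lower()


def find_table_dirs(data_dir, cf_id):
    """Return directories under <data_dir>/<keyspace>/ whose name ends with the table id."""
    suffix = cfid_to_dir_suffix(cf_id)
    return sorted(p for p in Path(data_dir).glob(f"*/*-{suffix}") if p.is_dir())


if __name__ == "__main__":
    cf_id = "cba90a70-5c46-11e9-9e36-f54fe3235e69"  # taken from the warning in the log
    # Assumed default path; adjust to match data_file_directories in cassandra.yaml.
    for path in find_table_dirs("/var/lib/cassandra/data", cf_id):
        data_files = list(path.glob("*-Data.db"))
        print(f"{path}: {len(data_files)} SSTable data file(s)")
```

Running this on each node would show whether the directory for that table id still holds live SSTables directly, as opposed to only under a snapshots subdirectory.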
Other facts I can tell you are:
* I have been getting anticompaction errors from 2 nodes because
their disks were almost full.
* The cluster was online on Friday.
* This morning, Monday, the whole cluster was offline and I
noticed the "missing keyspace" problem.
* During the weekend the cluster was receiving inserts
and deletes.
* I have a 9-node (HDD) Cassandra 3.11 cluster.
I really need help with this. How can I restore the cluster?
Thank you very much
Marco