Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
hi @Erick, actually this timestamp *1614575293790* is equivalent to *GMT: Monday, 1 March 2021 05:08:13.790*, which corresponds to *GMT+1: Monday, 1 March 2021 06:08:13.790* (my local timezone). This is consistent with the other log times in the cluster. Thank
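For reference, the epoch-millisecond value can be checked directly (a minimal Python sketch; the GMT+1 offset is simply Marco's stated local timezone):

    from datetime import datetime, timezone, timedelta

    ts_millis = 1614575293790  # value from the snapshot directory name
    utc = datetime.fromtimestamp(ts_millis / 1000, tz=timezone.utc)
    local = utc.astimezone(timezone(timedelta(hours=1)))  # GMT+1

    print(utc)    # 2021-03-01 05:08:13.790000+00:00
    print(local)  # 2021-03-01 06:08:13.790000+01:00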

Re: MISSING keyspace

2021-03-01 Thread Erick Ramirez
The timestamp (1614575293790) in the snapshot directory name is equivalent to 1 March 16:08 GMT: > actually I found a lot of .db files in the following directory: > /var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-mytable > which lines

Re: Cassandra on arm aws instances

2021-03-01 Thread Erick Ramirez
> it's not the same, notice I wrote r6gd, these are the ones with nvme, i'm looking just at those. I'm aware. I did use r6gd.2xlarge in my example. :) > I do not need all the space that i3en gives me (and probably won't be able to use it all due to memory usage, or have other issues

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
I haven't made any schema modifications for a year or more. This problem came up during a "normal day of work" for Cassandra. On Mon, 1 Mar 2021 at 16:25, Bowen Song wrote: > Your missing keyspace problem has nothing to do with that bug. > > In that case, the same table was

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
Your missing keyspace problem has nothing to do with that bug. In that case, the same table was created twice in a very short period of time, and I suspect that was done concurrently on two different nodes. The evidence lies in the two CF IDs - bd7200a0156711e88974855d74ee356f and
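Cassandra CF IDs are version-1 (time-based) UUIDs, so the creation time of each table can be read straight out of the ID. A minimal Python sketch using the one CF ID quoted in full above (the second one is truncated in this message):

    import uuid
    from datetime import datetime, timedelta

    GREGORIAN_EPOCH = datetime(1582, 10, 15)  # origin of the UUIDv1 timestamp

    def cf_id_creation_time(cf_id_hex: str) -> datetime:
        u = uuid.UUID(cf_id_hex)
        # u.time = number of 100-nanosecond intervals since 1582-10-15
        return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

    print(cf_id_creation_time("bd7200a0156711e88974855d74ee356f"))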

Re: Cassandra on arm aws instances

2021-03-01 Thread Gil Ganz
It's not the same, notice I wrote r6gd, these are the ones with NVMe; I'm looking just at those. I do not need all the space that i3en gives me (and probably won't be able to use it all due to memory usage, or have other issues just like you mention), so the plan is to use big enough r6gd nodes,

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
Actually I found a lot of .db files in the following directory: /var/lib/cassandra/data/mykespace/mytable-2795c0204a2d11e9aba361828766468f/snapshots/dropped-1614575293790-mytable. I also found this: 2021-03-01 06:08:08,864 INFO [Native-Transport-Requests-1]
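If the snapshot files are intact, the usual recovery path is to recreate the table with the same schema, copy the snapshot SSTables into the newly created table directory, and then run "nodetool refresh". A rough sketch under those assumptions; the destination directory name below is a placeholder, because the recreated table gets a new table ID:

    import glob, os, shutil

    snapshot_dir = ("/var/lib/cassandra/data/mykespace/"
                    "mytable-2795c0204a2d11e9aba361828766468f/snapshots/"
                    "dropped-1614575293790-mytable")
    # Directory of the re-created table -- new table ID, placeholder here.
    restored_dir = "/var/lib/cassandra/data/mykespace/mytable-<new-table-id>"

    # Copy every SSTable component file from the snapshot into the live directory.
    for path in glob.glob(os.path.join(snapshot_dir, "*")):
        if os.path.isfile(path):
            shutil.copy2(path, restored_dir)

    # Afterwards, on each node: nodetool refresh mykespace mytable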

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
The warning message indicates the node y.y.y.y went down (or is unreachable via network) before 2021-02-28 05:17:33. Is there any chance you can find the log file on that node at around or before that time? It may show why that node went down. The reason for that might be irrelevant to the

Re: Cassandra on arm aws instances

2021-03-01 Thread Erick Ramirez
The instance types you refer to are contradictory, so I'm not sure whether this is really about Arm-based servers. The i3en-vs-r6 is not an apples-for-apples comparison. The R6g type is EBS-only, so it will perform significantly worse than i3 instances. R6gd instances come with NVMe SSDs but they are

Re: Recovery after server crash 4.0b3

2021-03-01 Thread David Tinker
Thanks guys. The IP address hasn't changed, so I will go ahead and start the server and repair. On Mon, Mar 1, 2021 at 1:50 PM Erick Ramirez wrote: > If the node's only been down for less than gc_grace_seconds and the data in the drives are intact, you should be fine just booting the server

Re: MISSING keyspace

2021-03-01 Thread Erick Ramirez
As the warning message suggests, you need to check for schema disagreement. My suspicion is that someone made a schema change and possibly dropped the problematic keyspace. FWIW I suspect the keyspace was dropped because the table isn't new -- CF ID cba90a70-5c46-11e9-9e36-f54fe3235e69 is
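One quick way to check for schema disagreement is "nodetool describecluster", which should report a single schema version across all nodes. The same information can be read from the system tables; a sketch using the Python driver, where the contact point is an assumption:

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()

    local = session.execute("SELECT schema_version FROM system.local").one()
    print("local:", local.schema_version)

    # Each known peer, with the schema version this node has seen for it.
    for row in session.execute("SELECT peer, schema_version FROM system.peers"):
        print(row.peer, row.schema_version)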

Re: Recovery after server crash 4.0b3

2021-03-01 Thread Erick Ramirez
If the node's only been down for less than gc_grace_seconds and the data in the drives are intact, you should be fine just booting the server and it will join the cluster. You will need to run a repair so it picks up the missed mutations. @Bowen FWIW no need to do a "replace" -- the node will
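gc_grace_seconds is a per-table setting (default 864000 seconds, i.e. 10 days) and can be read from system_schema.tables to judge how long a node can stay down before a repair alone is no longer safe. A sketch with the Python driver; the contact point, keyspace and table names are placeholders:

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()
    row = session.execute(
        "SELECT gc_grace_seconds FROM system_schema.tables "
        "WHERE keyspace_name = %s AND table_name = %s",
        ("mykeyspace", "mytable"),
    ).one()
    # Default is 864000 seconds (10 days) unless the table overrides it.
    print("gc_grace_seconds:", row.gc_grace_seconds)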

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
Here is the previous error: 2021-02-28 05:17:33,262 WARN NodeConnectionsService.java:165 validateAndConnectIfNeeded failed to connect to node {y.y.y.y}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{9ba2d3ee-bc82-4e76-ae24-9e20eb334c24}{y.y.y.y}{y.y.y.y:9300}{ALIVE}{rack=r1, dc=DC1} (tried [1] times)

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
What was the warning? Is it related to the disk failure policy? Could you please share the relevant log? You can edit it and redact the sensitive information before sharing it. Also, I can't help but notice that you used the word "delete" (instead of "clear") to describe the process of

Re: MISSING keyspace

2021-03-01 Thread Marco Gasparini
Thanks Bowen for answering. Actually, I checked the server log and the only warning was that a node went offline. No, I have no backups or snapshots. In the meantime I found that Cassandra probably moved all the files from a directory to the snapshot directory. I am pretty sure of that because I have
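This matches Cassandra's auto_snapshot behaviour: when a table is dropped (and auto_snapshot is enabled, which is the default), its SSTables are preserved under a snapshots/dropped-<timestamp>-<table> directory instead of being deleted outright. A small sketch to locate such directories, assuming the default data path:

    import glob

    # Snapshots left behind by dropped tables, across all keyspaces and tables.
    for d in glob.glob("/var/lib/cassandra/data/*/*/snapshots/dropped-*"):
        print(d)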

Re: Recovery after server crash 4.0b3

2021-03-01 Thread Bowen Song
Has the IP address changed? If the IP address hasn't changed and the data is still on disk, you should be able to start this node and it will become available again. Note: you may need to repair this node after that. However, if the IP address has changed as the result of replacing the

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
The first thing I'd check is the server log. The log may contain vital information about the cause, and there may be different ways to recover depending on the cause. Also, please allow me to ask a seemingly obvious question: do you have a backup? On 01/03/2021 09:34,

Recovery after server crash 4.0b3

2021-03-01 Thread David Tinker
Hi guys, I have a 3-node cluster running 4.0b3 with all data replicated to all 3 nodes. This morning one of the servers started randomly rebooting (up for a minute or two, then rebooting again) for a couple of hours. The cluster continued running normally during this time (nice!). My hosting company has

MISSING keyspace

2021-03-01 Thread Marco Gasparini
Hello everybody, this morning (Monday!!!) I was checking on the Cassandra cluster and I noticed that all data was missing. I noticed the following error on each node (9 nodes in the cluster): 2021-03-01 09:05:52,984 WARN [MessagingService-Incoming-/x.x.x.x] IncomingTcpConnection.java:103

Re: Impact analysis of upgrading RHEL/SLES OS

2021-03-01 Thread Erick Ramirez
In most cases, minor OS upgrades are not problematic provided you have sufficient capacity in your cluster so that it can tolerate scheduled downtime while some nodes are being upgraded. One thing you should be aware of is the patches in newer versions of Linux distributions that address Spectre