Hi All,

I have a single-node Cassandra cluster that I use for development on my
local machine. After apt package upgrades and hard reboots, the node takes
a very long time to restart.

The node always comes back up eventually, but it sometimes takes ages. It
seems to be CPU bound, as all 4 cores are maxed out by Cassandra. The disk
I/O is relatively tiny (less than 1 MB/s) considering it's running on an
SSD.

According to the logs, start-up once took over 6 hours. From a development
point of view it's not the end of the world, but should I suffer a data
centre outage in production this could massively delay the time to come
back online.
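
One way to see which start-up phase eats the time is to look for the
largest pauses between consecutive log entries. Below is a rough Python
sketch, not anything that ships with Cassandra. It assumes each log line
carries a timestamp of the form "2015-03-10 12:00:00,123" (check the
logging configuration if yours differs); the function name largest_gaps
and the log path passed on the command line are made up for illustration.

```python
import re
import sys
from datetime import datetime

# Assumed timestamp layout: "YYYY-MM-DD HH:MM:SS,mmm" somewhere in the line.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")
TS_FMT = "%Y-%m-%d %H:%M:%S,%f"


def largest_gaps(lines, top=5):
    """Return the `top` longest pauses as (seconds, line_before, line_after)."""
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        match = TS_RE.search(line)
        if match is None:
            continue  # skip lines without a timestamp, e.g. stack traces
        ts = datetime.strptime(match.group(0), TS_FMT)
        if prev_ts is not None:
            pause = (ts - prev_ts).total_seconds()
            gaps.append((pause, prev_line, line.rstrip()))
        prev_ts, prev_line = ts, line.rstrip()
    # Longest pauses first.
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python log_gaps.py /path/to/system.log
    with open(sys.argv[1], errors="replace") as handle:
        for pause, before, after in largest_gaps(handle):
            print(f"{pause:10.1f}s between:\n  {before}\n  {after}")
```

The log line printed just before each long pause shows what the node had
started doing when it went quiet, which should hint at whether the time
goes to one particular table or to some other start-up step.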

I suspect the workload might be causing it. There's 16 GB of data actually
stored in it. However, one of the tables holds a message queue, which may
well have a few hundred thousand tombstones and up to 500 KB per record. Is
this likely to have an impact on start-up time? Is there anything I can do
to mitigate it? The queries on this table are fast because they know where
to start, so using the table is not an issue.

Any other suggestions to look at?

Thanks,

Charlie M
