Re: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-06 Thread kurt greaves
>
>  What we did have was some overlap between our daily repair cronjob and
> the newly added node, which was still joining. I don't know whether this
> combination might be causing trouble.

I wouldn't be surprised if this caused problems. Probably want to avoid
that.
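
For example, a small wrapper around the cron entry could skip the run while
any node is still joining (status "UJ" in nodetool status). This is only a
rough sketch, reusing the keyspace/CF names from your command:

    #!/bin/sh
    # Skip the nightly repair while any node is still joining (UJ)
    if nodetool status | grep -q '^UJ'; then
        echo "a node is still joining, skipping this repair run" >&2
        exit 0
    fi
    nodetool repair -pr ks cf1 cf2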

> waiting a few minutes after each execution finished. Every time, I see
> "… out of sync …" log messages in the context of the repair, so it looks
> like each repair execution is detecting inconsistencies. Does this make
> sense?

Well, it doesn't, but there have been issues in the past that caused exactly
this problem. I was under the impression they were all fixed by 2.1.18,
though.

> Additionally, we are writing at CL ANY, reading at ONE, and read repair
> chance for the two CFs in question is the default 0.1.

Have you considered writing at least at CL [LOCAL_]ONE? At the very least
it would rule out whether hints are part of the problem.
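
If you want to check whether hints are piling up before changing anything, a
rough way on 2.1 (where hints are stored in the system.hints table) would be:

    # hinted handoff thread pool activity on this node
    nodetool tpstats | grep -i hint

    # stored hints on this node (can be slow if there are many)
    cqlsh -e "SELECT count(*) FROM system.hints;"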


RE: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-06 Thread Steinmaurer, Thomas
Hi Kurt,

our provisioning layer only allows extending a cluster one node at a time, so
we didn't add multiple nodes at the same time.

What we did have was some overlap between our daily repair cronjob and the
newly added node, which was still joining. I don't know whether this
combination might be causing trouble.

I did some further testing and ran the following repair call repeatedly on the same node,

nodetool repair -pr ks cf1 cf2

waiting a few minutes after each execution finished. Every time, I see
"… out of sync …" log messages in the context of the repair, so it looks like
each repair execution is detecting inconsistencies. Does this make sense?
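
For illustration, the test looks roughly like this (the log path below is the
usual default and may differ; grep -c counts all matches in the log so far,
so the delta between runs is what matters):

    # repeat the repair a few times and watch for "out of sync" messages
    for i in 1 2 3; do
        nodetool repair -pr ks cf1 cf2
        grep -c "out of sync" /var/log/cassandra/system.log
        sleep 300
    done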

As said: we are using vnodes (256) and RF=3. Additionally, we are writing at
CL ANY, reading at ONE, and read repair chance for the two CFs in question is
the default 0.1.
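
(The per-table setting can be double-checked from cqlsh if needed, e.g.

    cqlsh -e "DESCRIBE TABLE ks.cf1;" | grep -i read_repair

which shows both read_repair_chance and dclocal_read_repair_chance.)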

I am currently testing a few consecutive executions without -pr on the same node.

Thanks,
Thomas

From: kurt greaves [mailto:k...@instaclustr.com]
Sent: Monday, 05 March 2018 01:10
To: User <user@cassandra.apache.org>
Subject: Re: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K 
SSTables for a single small (GBytes) CF

Repairs with vnodes are likely to cause a lot of small SSTables if you have 
inconsistencies (at least one per vnode). Did you have any issues when adding 
nodes, or did you add multiple nodes at a time? Anything that could have led 
to a bit of inconsistency could have been the cause.

I'd probably avoid running the repairs across all the nodes simultaneously and 
instead spread them out over a week; running them all at once likely made it 
worse. Also worth noting that in versions 3.0+ you won't be able to run 
nodetool repair in such a way, because anti-compaction will be triggered, and 
it will fail if multiple anti-compactions are attempted simultaneously (i.e. 
if you run multiple repairs at the same time).

Have a look at orchestrating your repairs with TLP's fork of 
cassandra-reaper <http://cassandra-reaper.io/>.


Re: Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-04 Thread kurt greaves
Repairs with vnodes are likely to cause a lot of small SSTables if you have
inconsistencies (at least one per vnode). Did you have any issues when adding
nodes, or did you add multiple nodes at a time? Anything that could have
led to a bit of inconsistency could have been the cause.

I'd probably avoid running the repairs across all the nodes simultaneously
and instead spread them out over a week; running them all at once likely made
it worse. Also worth noting that in versions 3.0+ you won't be able to run
nodetool repair in such a way, because anti-compaction will be triggered, and
it will fail if multiple anti-compactions are attempted simultaneously (i.e.
if you run multiple repairs at the same time).

Have a look at orchestrating your repairs with TLP's fork of
cassandra-reaper <http://cassandra-reaper.io/>.
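
For illustration, one crude way to spread the -pr repairs over the week with
plain cron is to give each node its own weekday (the day mapping below is
just an example; the reaper fork above is the more robust option):

    # node1's crontab: primary-range repair only on Mondays at 02:00
    0 2 * * 1  nodetool repair -pr ks cf1 cf2
    # node2 gets Tuesday (0 2 * * 2), node3 Wednesday, and so on;
    # with 9 nodes a few days carry two nodes, which is still far less
    # overlap than all nine at once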


Cassandra 2.1.18 - Concurrent nodetool repair resulting in > 30K SSTables for a single small (GBytes) CF

2018-03-01 Thread Steinmaurer, Thomas
Hello,

Production: 9-node cluster with Cassandra 2.1.18, vnodes (256 tokens, the 
default), RF=3, compaction throttling = 16, concurrent compactors = 4, running 
in AWS on m4.xlarge at ~35% average CPU.

We have a nightly cronjob starting "nodetool repair -pr ks cf1 cf2" 
concurrently on all nodes; the data volume for cf1 and cf2 is ~1-5 GB, so 
pretty small.
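
Schematically, the cron entry is the same on every node and fires at the same
time (the exact schedule below is illustrative):

    # identical crontab entry on all 9 nodes, all starting at once
    0 1 * * *  nodetool repair -pr ks cf1 cf2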

After extending the cluster from 6 to the current 9 nodes and finishing 
"nodetool cleanup", the above repair results in > 30K SSTables for these two 
CFs on several nodes, with very, very tiny files (< 1 KB each), but not on all 
nodes. Obviously, this affects read latency, disk I/O and CPU a lot, and it 
takes several hours until the situation relaxes. We have other clusters with 
the same spec which have also been extended from 6 to 9 nodes in the past, 
where we don't see this issue. For now, we have disabled the nightly cron job.
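
To quantify the effect per node, the SSTable counts can be pulled from cfstats
or counted directly on disk, roughly as follows (the data directory path is
the usual default and may differ):

    # SSTable count as reported by Cassandra
    nodetool cfstats ks.cf1 ks.cf2 | grep -E 'Table:|SSTable count'

    # or count Data.db files on disk (directory names include the CF id)
    ls /var/lib/cassandra/data/ks/cf1-*/ | grep -c 'Data.db'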

Any input on how to troubleshoot the root cause of this issue?

Thanks,
Thomas
