Anthony P. Scism
Info Tech-Risk Mgmt/Client Sys - Capacity Planning
Work: 402-544-0361 Mobile: 402-707-4446

From:   "Durity, Sean R" <sean_r_dur...@homedepot.com>
To:     "user@cassandra.apache.org" <user@cassandra.apache.org>
Date:   09/19/2017 09:25 AM
Subject:        RE: Multi-node repair fails after upgrading to 3.0.14

Required maintenance for a cluster should not be this complicated and 
should not be changing so often. To me, this is a major flaw in Cassandra.
Sean Durity
From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com] 
Sent: Tuesday, September 19, 2017 2:33 AM
To: user@cassandra.apache.org
Subject: RE: Multi-node repair fails after upgrading to 3.0.14
Hi Kurt,
thanks for the link!
Honestly, it's a pity that in 3.0 we can't get back the simple, reliable, 
and predictable way of running a full repair for very low data volume CFs, 
kicked off on all nodes in parallel, without all the magic behind the 
scenes introduced by incremental repairs (even when they are not used), 
since anticompaction even with -full was introduced in 2.2+ :-)
From: kurt greaves [mailto:k...@instaclustr.com] 
Sent: Dienstag, 19. September 2017 06:24
To: User <user@cassandra.apache.org>
Subject: Re: Multi-node repair fails after upgrading to 3.0.14
https://issues.apache.org/jira/browse/CASSANDRA-13153 implies full repairs 
still trigger anti-compaction on non-repaired SSTables (if I'm reading 
that right), so you might need to make sure you don't run multiple repairs 
at the same time across your nodes (if you're using vnodes); otherwise you 
could still end up trying to run anti-compaction on the same SSTable from 2 
Anyone else feel free to jump in and correct me if my interpretation is 
On 18 September 2017 at 17:11, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:
what should be the expected outcome when running with 3.0.14:
nodetool repair -full -pr keyspace cfs
- Should -full trigger anti-compaction?
- Should this be the same operation as nodetool repair -pr keyspace cfs 
  in 2.1?
- Should I be able to run this on several nodes in parallel as in 2.1 
  without troubles, where incremental repair was not the default?
Still confused if I'm missing something obvious. Sorry about that. :-)
From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Montag, 18. September 2017 16:10

To: user@cassandra.apache.org
Subject: Re: Multi-node repair fails after upgrading to 3.0.14
Sorry, I may be wrong about the cause - didn't see -full
Mea culpa, it's early here and I'm not awake

Jeff Jirsa

On Sep 18, 2017, at 7:01 AM, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:
Hi Jeff,
understood. That’s quite a change then coming from 2.1 from an operational 
Thanks again.
From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Montag, 18. September 2017 15:56
To: user@cassandra.apache.org
Subject: Re: Multi-node repair fails after upgrading to 3.0.14
The command you're running will cause anticompaction at the range borders 
on all instances at the same time.
Since only one repair session can anticompact any given SSTable, it's 
almost guaranteed to fail.
Run it on one instance at a time.
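
A minimal sketch of "one instance at a time", for illustration only. The 
host names, keyspace and table names are hypothetical, and the sketch 
assumes nodetool can reach each node with "-h <host>" and exits non-zero 
when a repair fails, which should be verified for your version and setup.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence


def build_repair_command(host: str, keyspace: str, tables: Sequence[str]) -> List[str]:
    """Build a full, primary-range repair command for a single node."""
    return ["nodetool", "-h", host, "repair", "-full", "-pr", keyspace, *tables]


def repair_one_node_at_a_time(
    hosts: Iterable[str],
    keyspace: str,
    tables: Sequence[str],
    run: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Repair the given hosts strictly in sequence, stopping at the first failure.

    Returns the hosts whose repair command exited with status 0.
    """
    repaired = []
    for host in hosts:
        # The default runner (subprocess.call) blocks until nodetool exits,
        # so the next node is not started while this one is still repairing.
        exit_code = run(build_repair_command(host, keyspace, tables))
        if exit_code != 0:
            break
        repaired.append(host)
    return repaired
```

Called as repair_one_node_at_a_time(["node1", "node2"], "keyspace", ["cfs"]) 
from a single coordinator host, this replaces a cron job that starts on 
every node at the same minute.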

Jeff Jirsa

On Sep 18, 2017, at 1:11 AM, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:
Hi Alex,
I now ran nodetool repair -full -pr keyspace cfs on all nodes in parallel 
and this may pop up now: (progress: 1%)
[2017-09-18 07:59:17,145] Some repair failed
[2017-09-18 07:59:17,151] Repair command #3 finished in 0 seconds
error: Repair job has failed with the error message: [2017-09-18 
07:59:17,145] Some repair failed
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: 
[2017-09-18 07:59:17,145] Some repair failed
2017-09-18 07:59:17 repair finished
If running the above nodetool call sequentially on all nodes, repair 
finishes without printing a stack trace.
The error message and stack trace aren't really useful here. Any further 
From: Alexander Dejanovski [mailto:a...@thelastpickle.com] 
Sent: Freitag, 15. September 2017 11:30
To: user@cassandra.apache.org
Subject: Re: Multi-node repair fails after upgrading to 3.0.14
Right, you should indeed add the "--full" flag to perform full repairs, 
and you can then keep the "-pr" flag.
I'd advise monitoring the status of your SSTables, as you'll probably end 
up with a pool of SSTables marked as repaired, and another pool marked as 
unrepaired which won't be compacted together (hence the suggestion of 
running subrange repairs).
Use sstablemetadata to check on the "Repaired at" value for each. 0 means 
unrepaired and any other value (a timestamp) means the SSTable has been 
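
A small sketch of the "Repaired at" check described above, for 
illustration only. It assumes the sstablemetadata output contains a line 
of the form "Repaired at: <number>"; the helper names are hypothetical and 
the exact output layout should be checked against your Cassandra version.

```python
import re
from typing import Optional

# Matches a line such as "Repaired at: 0" or "Repaired at: 1505458829000".
REPAIRED_AT = re.compile(r"^Repaired at:\s*(\d+)\s*$", re.MULTILINE)


def repaired_at(metadata_text: str) -> Optional[int]:
    """Return the "Repaired at" value from sstablemetadata output, or None if absent."""
    match = REPAIRED_AT.search(metadata_text)
    return int(match.group(1)) if match else None


def repair_state(metadata_text: str) -> str:
    """Classify one SSTable as "repaired", "unrepaired", or "unknown"."""
    value = repaired_at(metadata_text)
    if value is None:
        return "unknown"
    # 0 means unrepaired; any other value is the repair timestamp.
    return "unrepaired" if value == 0 else "repaired"
```

Feeding it the captured output of sstablemetadata for each data file gives 
a quick count of how many SSTables sit in each pool.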
I've seen cases in the past where running "-pr" on the whole cluster 
would still not mark all SSTables as repaired, but I can't say whether 
that behavior has changed in the latest versions.
Having separate pools of SSTables that cannot be compacted together means 
that you might have tombstones that don't get evicted due to partitions 
living in both states (repaired/unrepaired).
To sum up the recommendations:
- Run a full repair with both "--full" and "-pr" and check that SSTables 
are properly marked as repaired
- Use a tight repair schedule to avoid keeping partitions for too long in 
both repaired and unrepaired state
- Switch to subrange repair if you want to fully avoid marking SSTables as 
repaired (which you don't need anyway since you're not using incremental 
repairs). If you wish to do this, you'll have to mark all your SSTables 
back as unrepaired, using the sstablerepairedset tool.
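
A rough sketch of what subrange repair means in practice, for 
illustration only: a token range is cut into smaller pieces and each 
piece is repaired with explicit start and end tokens (-st / -et). The 
function names and the number of pieces are hypothetical, wrapping ranges 
are not handled, and tools such as Reaper do considerably more than this.

```python
from typing import List, Tuple


def split_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split the token range (start, end] into contiguous subranges.

    Assumes a non-wrapping range, i.e. start < end.
    """
    if parts < 1 or start >= end:
        raise ValueError("need parts >= 1 and start < end")
    parts = min(parts, end - start)  # never produce empty subranges
    step, extra = divmod(end - start, parts)
    bounds = [start]
    for i in range(parts):
        # Spread the remainder over the first `extra` subranges.
        bounds.append(bounds[-1] + step + (1 if i < extra else 0))
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_repair_commands(
    keyspace: str, table: str, start: int, end: int, parts: int
) -> List[List[str]]:
    """Build one full subrange repair command per piece of (start, end]."""
    return [
        ["nodetool", "repair", "-full", "-st", str(s), "-et", str(e), keyspace, table]
        for s, e in split_range(start, end, parts)
    ]
```

For example, subrange_repair_commands("XXX", "YYY", 402227699082311580, 
407458511300218383, 4) yields four commands that together cover one of the 
ranges shown in the logs further down the thread.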
On Fri, Sep 15, 2017 at 10:27 AM Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:
Hi Alex,
thanks a lot. Somehow I missed that incremental repairs are the default now.
We have been happy with full repair so far, because the data for which we 
currently invoke repair manually is small (~1GB or even smaller).
So I guess with full repairs across all nodes, we can still stick with the 
partition range (-pr) option, but with 3.0 we additionally have to provide 
the -full option, right?
Thanks again,
From: Alexander Dejanovski [mailto:a...@thelastpickle.com] 
Sent: Freitag, 15. September 2017 09:45
To: user@cassandra.apache.org
Subject: Re: Multi-node repair fails after upgrading to 3.0.14
Hi Thomas,
in 2.1.18, the default repair mode was full repair, while since 2.2 it is 
incremental repair.
So running "nodetool repair -pr" since your upgrade to 3.0.14 doesn't 
trigger the same operation.
Incremental repair cannot run on more than one node at a time on a 
cluster, because you risk conflicts between sessions trying to 
anticompact and run validation compactions on the same SSTables (which 
will make the validation phase fail, as your logs are showing).
Furthermore, you should never use "-pr" with incremental repair because it 
is useless in that mode, and won't properly perform anticompaction on all 
If you were happy with full repairs in 2.1.18, I'd suggest sticking with 
those in 3.0.14 as well because there are still too many caveats with 
incremental repairs that should hopefully be fixed in 4.0+.
Note that full repair will also trigger anticompaction and mark SSTables 
as repaired in your release of Cassandra; full subrange repairs are the 
only flavor that will skip anticompaction. 
You will need some tooling to help with subrange repairs though, and I'd 
recommend using Reaper, which handles automation for you: 
If you decide to stick with incremental repairs, first perform a rolling 
restart of your cluster to make sure no repair session still runs, and run 
"nodetool repair" on a single node at a time. Move on to the next node 
only when nodetool or the logs show that repair is over (which will 
include the anticompaction phase).
On Fri, Sep 15, 2017 at 8:42 AM Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:
we are currently in the process of upgrading from 2.1.18 to 3.0.14. After 
upgrading a few test environments, we have started to see some suspicious 
log entries regarding repair issues.
We have a cron job on all nodes basically executing the following repair 
call on a daily basis:
nodetool repair -pr <list of CFs>
This gets started on all nodes at the same time. While this has worked 
with 2.1.18 (at least we haven’t seen anything suspicious in the Cassandra 
log), with 3.0.14 we get something like the following on all nodes (see 
below; IP addresses and KS/CF faked).
Any pointers are appreciated. Thanks.
INFO  [Thread-2941] 2017-09-15 03:00:28,036 RepairSession.java:224 - 
[repair #071f81e0-99c2-11e7-91dc-6132f5fe5fb0] new session: will sync 
/FAKE.33.64, /FAKE.35.153, /FAKE.34.171 on range 
(402227699082311580,407458511300218383]] for XXX.[YYY, ZZZ]
INFO  [Repair#1:1] 2017-09-15 03:00:28,419 RepairJob.java:172 - [repair 
#071f81e0-99c2-11e7-91dc-6132f5fe5fb0] Requesting merkle trees for YYY (to 
[/FAKE.35.153, /FAKE.34.171, /FAKE.33.64])
INFO  [Thread-2941] 2017-09-15 03:00:28,434 RepairSession.java:224 - 
[repair #075d2720-99c2-11e7-91dc-6132f5fe5fb0] new session: will sync 
/FAKE.33.64, /FAKE.35.57, /FAKE.34.171 on range 
(6990922382177634221,7007948980474566617]] for XXX.[YYY, ZZZ]
INFO  [Thread-2941] 2017-09-15 03:00:28,778 RepairSession.java:224 - 
[repair #0791a4a0-99c2-11e7-91dc-6132f5fe5fb0] new session: will sync 
/FAKE.33.64, /FAKE.35.57, /FAKE.34.90 on range 
(7186200312153798494,7187207897161667549]] for XXX.[YYY, ZZZ]
INFO  [Thread-2941] 2017-09-15 03:00:28,942 RepairSession.java:224 - 
[repair #07aaaae0-99c2-11e7-91dc-6132f5fe5fb0] new session: will sync 
/FAKE.33.64, /FAKE.35.153, /FAKE.34.90 on range 
(2018021481024026660,2049270980733207626]] for XXX.[YYY, ZZZ]
ERROR [ValidationExecutor:3] 2017-09-15 03:00:29,471 
ActiveRepairService.java:554 - Cannot start multiple repair sessions over 
the same sstables
ERROR [ValidationExecutor:3] 2017-09-15 03:00:29,471 Validator.java:268 - 
Failed creating a merkle tree for [repair 
#071f81e0-99c2-11e7-91dc-6132f5fe5fb0 on XXX/YYY, 
(402227699082311580,407458511300218383]]], /FAKE.33.64 (see log for 
INFO  [AntiEntropyStage:1] 2017-09-15 03:00:29,473 RepairSession.java:176 
- [repair #071f81e0-99c2-11e7-91dc-6132f5fe5fb0] Received merkle tree for 
YYY from /FAKE.33.64
ERROR [Repair#1:1] 2017-09-15 03:00:29,492 CassandraDaemon.java:207 - 
Exception in thread Thread[Repair#1:1,5,RMI Runtime]
org.apache.cassandra.exceptions.RepairException: [repair 
#071f81e0-99c2-11e7-91dc-6132f5fe5fb0 on XXX/YYY, 
(402227699082311580,407458511300218383]]] Validation failed in /FAKE.33.64
        at org.apache.cassandra.repair.RepairJob.run(RepairJob.java:160) 
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_102]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#071f81e0-99c2-11e7-91dc-6132f5fe5fb0 on XXX/YYY, 
(402227699082311580,407458511300218383]]] Validation failed in /FAKE.33.64
        at org.apache.cassandra.net
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
        ... 4 common frames omitted
WARN  [RepairJobTask:2] 2017-09-15 03:00:29,493 RepairJob.java:153 - 
[repair #071f81e0-99c2-11e7-91dc-6132f5fe5fb0] YYY sync failed
ERROR [RepairJobTask:2] 2017-09-15 03:00:29,498 RepairSession.java:277 - 
[repair #071f81e0-99c2-11e7-91dc-6132f5fe5fb0] Session completed with the 
following error
org.apache.cassandra.exceptions.RepairException: [repair 
#071f81e0-99c2-11e7-91dc-6132f5fe5fb0 on XXX/YYY, 
(402227699082311580,407458511300218383]]] Validation failed in /FAKE.33.64
        at org.apache.cassandra.net
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_102]
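
For anyone wanting to spot this condition automatically, here is a small 
sketch that scans Cassandra log lines for the overlap error and collects 
the repair session ids mentioned on ERROR/WARN lines. It is illustrative 
only: the function names are hypothetical, and it assumes each log entry 
is on a single line (the entries above were wrapped by the mail client).

```python
import re
from typing import Dict, Iterable, List

OVERLAP_ERROR = "Cannot start multiple repair sessions over the same sstables"
# Repair session ids appear as "[repair #<uuid>" in the log entries.
SESSION_ID = re.compile(r"\[repair #([0-9a-f-]{36})")


def has_overlap_error(log_lines: Iterable[str]) -> bool:
    """True if any line reports concurrent repair sessions on the same SSTables."""
    return any(OVERLAP_ERROR in line for line in log_lines)


def failed_sessions(log_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each repair session id to the ERROR/WARN lines that mention it."""
    result: Dict[str, List[str]] = {}
    for line in log_lines:
        if not line.startswith(("ERROR", "WARN")):
            continue
        match = SESSION_ID.search(line)
        if match:
            result.setdefault(match.group(1), []).append(line.rstrip())
    return result
```

Run against the system log after a scheduled repair, a True result from 
has_overlap_error is a strong hint that repairs overlapped across nodes.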
The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or 
disclose it to anyone else. If you received it in error please notify us 
immediately and then destroy it. Dynatrace Austria GmbH (registration 
number FN 91482h) is a company registered in Linz whose registered office 
is at 4040 Linz, Austria, Freistädterstraße 313 
Alexander Dejanovski
Apache Cassandra Consulting

The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email 
by anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be 
taken in reliance on it, is prohibited and may be unlawful. When addressed 
to our clients any opinions or advice contained in this Email are subject 
to the terms and conditions expressed in any applicable governing The Home 
Depot terms of business or client engagement letter. The Home Depot 
disclaims all responsibility and liability for the accuracy and content of 
this attachment and for any damages or losses arising from any 
inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other 
items of a destructive nature, which may be contained in this attachment 
and shall not be liable for direct, indirect, consequential or special 
damages in connection with this e-mail message or its attachment.


This email and any attachments may contain information that is confidential 
and/or privileged for the sole use of the intended recipient.  Any use, review, 
disclosure, copying, distribution or reliance by others, and any forwarding of 
this email or its contents, without the express permission of the sender is 
strictly prohibited by law.  If you are not the intended recipient, please 
contact the sender immediately, delete the e-mail and destroy all copies.
