Kinda. It isn’t that you have to repair twice per se. It’s that having enough 
time to run repairs at least twice before gc_grace_seconds elapses guarantees 
that every tombstone is subject to repair at least once before you hit your GC 
grace seconds.

Imagine a tombstone created on the very first node that Reaper looked at in a 
repair cycle, but one second after Reaper completed repair of that particular 
token range.  The repair run will complete, but that particular tombstone just 
missed being part of it.

Now your next repair run happens.  What if Reaper doesn’t look at that same 
node first?  That can easily happen, because there is a bunch of logic around 
detecting existing repairs and handling segments that take too long.  So the 
box that was “the first node” in the first run gets, through bad luck, kicked 
down to later in the second run.  I’ve seen nodes get skipped multiple times 
(you can tune to reduce that, but bottom line… it happens).  So you’ve had a 
streak of bad luck.  Eventually the node does get repaired, and the aging 
tombstone finally gets removed.  All fine and dandy…

Provided that the second repair run got to that point BEFORE you hit your GC 
grace seconds.

That’s why you need enough time to run it twice: you need enough time to catch 
the oldest possible tombstone, even if it is dealt with at the very end of a 
repair run.  Yes, it sounds like a bit of a degenerate case, but if you are 
writing a lot of data, the probability of the degenerate case never becoming a 
real case is vanishingly small.
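
To make that concrete, here's a back-of-the-envelope sketch in Python. The 
run time is a made-up number for illustration; the point is just that the 
worst-case tombstone age at repair is twice the repair run time.

from datetime import timedelta

# Hypothetical numbers for illustration only.
repair_run_time = timedelta(days=3)    # one full repair cycle
gc_grace = timedelta(seconds=691200)   # 8 days

# Worst case: the tombstone appears just after its range is repaired in
# run 1, and through skipped-node bad luck its node is handled at the
# very end of run 2.
worst_case_tombstone_age = 2 * repair_run_time

print(worst_case_tombstone_age <= gc_grace)  # True: 6 days < 8 days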

R


From: Sergio <lapostadiser...@gmail.com>
Date: Wednesday, January 22, 2020 at 1:41 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Reid Pinchback 
<rpinchb...@tripadvisor.com>
Subject: Re: Is there any concern about increasing gc_grace_seconds from 5 days 
to 8 days?

I was wondering if I should always complete 2 repair cycles with Reaper, even 
if one repair cycle finishes in 7 hours.

Currently, I have around 200GB of column family data to be repaired. I was 
scheduling one repair a week, and that was not putting too much stress on my 
8-node cluster of i3.xlarge instances.

Thanks,

Sergio

On Wed, Jan 22, 2020 at 08:28 Sergio <lapostadiser...@gmail.com> wrote:
Thank you very much! Yes I am using reaper!

Best,

Sergio

On Wed, Jan 22, 2020, 8:00 AM Reid Pinchback <rpinchb...@tripadvisor.com> wrote:
Sergio, if you’re looking for a new repair frequency because of the change, 
and you are using Reaper, then I’d go for repair_freq <= gc_grace / 2.
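
In concrete numbers (a quick sketch; 691200 is simply 8 days expressed in 
seconds):

gc_grace_seconds = 8 * 24 * 3600            # 691200 seconds = 8 days
max_repair_interval = gc_grace_seconds / 2  # repair_freq <= gc_grace / 2
print(max_repair_interval / 86400)          # 4.0 -> repair at least every 4 days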

Pure serendipity: this came up in a conversation I was having at work this 
morning.  When you actually watch the Reaper logs, you can see situations 
where unlucky timing with skipped nodes can make the time to remove a 
tombstone be up to 2 x repair_run_time.

If you aren’t using Reaper, your mileage will vary, particularly if your 
repairs are consistent in their ordering across nodes.  Reaper can be 
moderately non-deterministic, hence the need to be sure you can complete at 
least two repair runs.

R

From: Sergio <lapostadiser...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, January 21, 2020 at 7:13 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Is there any concern about increasing gc_grace_seconds from 5 days 
to 8 days?

Thank you very much for your response.
The considerations mentioned are the ones that I was expecting.
I believe that I am good to go.
I just wanted to make sure that there was no need to run any extra command 
besides that one.

Best,

Sergio

On Tue, Jan 21, 2020, 3:55 PM Jeff Jirsa <jji...@gmail.com> wrote:
Note that if you're actually running repairs within 5 days and you adjust this 
to 8, you may stream a bunch of tombstones across in that 5-8 day window, 
which can increase disk usage / compaction (because as you pass 5 days, one 
replica may gc away the tombstones while the others may not, since the 
tombstones still shadow data, so you'll re-stream the tombstone to the other 
replicas).
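
To illustrate the window (a rough sketch of the timing only, not Cassandra's 
actual code path; the 6-day tombstone age is hypothetical):

from datetime import timedelta

old_grace = timedelta(days=5)
new_grace = timedelta(days=8)
tombstone_age = timedelta(days=6)  # hypothetical: in the 5-8 day window

# A replica that compacted before the change could purge this tombstone
# (age > old grace); replicas still holding shadowed data keep it under
# the new 8-day grace, so repair streams it back to the purged replica.
purged_on_one_replica = tombstone_age > old_grace   # True
still_live_on_others = tombstone_age < new_grace    # True
print(purged_on_one_replica and still_live_on_others)  # True -> re-streaming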

On Tue, Jan 21, 2020 at 3:28 PM Elliott Sims <elli...@backblaze.com> wrote:
In addition to extra space, queries can potentially be more expensive because 
more dead rows and tombstones will need to be scanned.  How much of a 
difference this makes will depend drastically on the schema and access pattern, 
but I wouldn't expect going from 5 days to 8 to be very noticeable.

On Tue, Jan 21, 2020 at 2:14 PM Sergio <lapostadiser...@gmail.com> wrote:
https://stackoverflow.com/a/22030790

For cqlsh:

ALTER TABLE <table_name> WITH gc_grace_seconds = <seconds>;
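
If you'd rather script the change than type it into cqlsh, here's a minimal 
sketch with the DataStax Python driver (the contact point, keyspace, and 
table names are placeholders; system_schema assumes Cassandra 3.0+):

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])  # placeholder contact point
session = cluster.connect()

# 691200 seconds = 8 days
session.execute("ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 691200")

# Verify the change took effect.
row = session.execute(
    "SELECT gc_grace_seconds FROM system_schema.tables "
    "WHERE keyspace_name = 'my_ks' AND table_name = 'my_table'"
).one()
print(row.gc_grace_seconds)  # 691200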



On Tue, Jan 21, 2020 at 13:12 Sergio <lapostadiser...@gmail.com> wrote:
Hi guys!

I just wanted to confirm with you before doing such an operation. I expect the 
space usage to increase, but nothing more than that. I just need to perform:

UPDATE COLUMN FAMILY cf WITH GC_GRACE = 691200; // 8 days

Is it correct?

Thanks,

Sergio
