I can't recommend *anyone* use incremental repair, as there are some pretty horrible bugs in it that can cause Merkle trees to wildly mismatch and result in massive overstreaming. Check out https://issues.apache.org/jira/browse/CASSANDRA-9143.
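If you're on 2.2 or 3.x, where incremental is the default, you have to ask for a full repair explicitly. Something along these lines on each node should keep you on full, primary-range repairs (the keyspace name here is just an example, substitute your own); I've also put the pr / full / incremental variants side by side at the very bottom of this mail, below the quoted thread:

    # illustrative only: -full forces a non-incremental repair,
    # -pr limits it to this node's primary ranges
    nodetool repair -full -pr my_keyspace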
TL;DR: Do not use incremental repair before 4.0.

On Tue, Jun 6, 2017 at 9:54 AM Anuj Wadehra <anujw_2...@yahoo.co.in.invalid> wrote:

> Hi Chris,
>
> Can you share the following info:
>
> 1. The exact repair commands you use for incremental repair and for pr repair.
>
> 2. Repair time should be measured at the cluster level for incremental repair. So, what is the total time it takes to run repair on all nodes for incremental vs pr repairs?
>
> 3. You are repairing one DC, DC3. How many DCs are there in total, and what is the RF for the keyspaces? Running pr on a specific DC would not repair the entire data.
>
> 4. 885 ranges? Where did you get this number? The logs? Can you share the number of ranges printed in the logs for both the incremental and the pr case?
>
> Thanks
> Anuj
>
> On Tue, Jun 6, 2017 at 9:33 PM, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
>
> Thank you for the excellent and clear description of the different versions of repair, Anuj; that has cleared up what I expect to be happening.
>
> The problem now is that in our cluster we are running repairs with the options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [DC3], hosts: [], # of ranges: 885), and our repairs are taking over a day to complete, whereas previously, running with the partition range option, they were taking more like 8-9 hours.
>
> As I understand it, using incremental repair should have sped this process up, as all three sets of data in each repair job should be marked as repaired; however, this does not seem to be the case. Any ideas?
>
> Chris
>
> On 6 Jun 2017, at 16:08, Anuj Wadehra <anujw_2...@yahoo.co.in.INVALID> wrote:
>
> Hi Chris,
>
> Using pr with incremental repairs does not make sense. Primary range repair is an optimization over full repair. If you run full repair on an n-node cluster with RF=3, you would be repairing each piece of data three times. E.g. in a 5 node cluster with RF=3, a range may exist on nodes A, B and C. When full repair is run on node A, the entire data in that range gets synced with the replicas on nodes B and C. Now, when you run full repair on nodes B and C, you are wasting resources repairing data which has already been repaired.
>
> Primary range repair ensures that when you run repair on a node, it ONLY repairs the data which is owned by that node. Thus, no node repairs data which is not owned by it and must be repaired by another node. Redundant work is eliminated.
>
> Even with pr, though, each time you run it on all nodes you repair 100% of the data. Why repair the complete data set in each cycle, even data which has not changed since the last repair cycle?
>
> This is where incremental repair comes in as an improvement. Once repaired, data is marked as repaired so that the next repair cycle can focus on just the delta. Now, let's go back to the example of the 5 node cluster with RF=3, but this time we run incremental repair on all nodes. When you repair the entire data on node A, all 3 replicas are marked as repaired. Even if you run incremental repair on all ranges on the second node, you will not re-repair the already repaired data. Thus, there is no advantage in repairing only the data owned by the node (the primary range of the node). You can run incremental repair on all the data present on a node, and Cassandra will make sure that when you repair data on other nodes, you only repair unrepaired data.
> Thanks
> Anuj
>
> On Tue, Jun 6, 2017 at 4:27 PM, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
>
> Hi all,
>
> Wondering if anyone had any thoughts on this? At the moment the long-running repairs cause us to be running them on two nodes at once for a stretch of time, which obviously increases the cluster load.
>
> On 2017-05-25 16:18 (+0100), Chris Stokesmore <c...@demandlogic.co> wrote:
>
> Hi,
>
> We are running a 7 node Cassandra 2.2.8 cluster, RF=3, and had been running repairs with the -pr option, via a cron job that runs on each node once per week.
>
> We changed that because some advice on the Cassandra IRC channel said it would cause more anticompaction, and http://docs.datastax.com/en/archived/cassandra/2.2/cassandra/tools/toolsRepair.html says 'Performing partitioner range repairs by using the -pr option is generally considered a good choice for doing manual repairs. However, this option cannot be used with incremental repairs (default for Cassandra 2.2 and later)'.
>
> The only problem is that our -pr repairs were taking about 8 hours, and now the non-pr repairs are taking 24+. I guess this makes sense, as repairing 1/7 of the data has increased to repairing 3/7, except I was hoping to see a speed-up after the first loop through the cluster, as each repair will be marking much more data as repaired, right?
>
> Is running -pr with incremental repairs really that bad?
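For anyone who finds this thread later: the repair variants being compared above correspond roughly to invocations like the following (2.2-era nodetool flags; keyspace names are omitted here, which repairs all keyspaces):

    # full repair of only this node's primary ranges (the old weekly cron scheme)
    nodetool repair -full -pr

    # full repair of all the data this node holds (what a non-pr full repair means)
    nodetool repair -full

    # incremental repair restricted to DC3; incremental, parallel and one job
    # thread are the 2.2 defaults, matching the options in the logs quoted above
    nodetool repair -dc DC3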