Re: How to start using incremental repairs?

2016-09-12 Thread Paulo Motta
> don't you think it might be better to keep applying the migration
> procedure whatever the version ?

Yes, it probably makes sense to keep the procedure for huge datasets to
avoid the cost of anti-compaction. But there seems to be some confusion,
with people thinking the procedure is *required* when it is only
"recommended". For moderately sized datasets it's probably simpler to just
pay the extra cost rather than deal with the extra complexity of the
migration procedure.

> Indeed, if you run an inc repair on all ranges on a node, it can skip
> anticompaction by just marking SSTables as being repaired (which is fast),
> but the rest of the nodes will still have to perform anticompaction as they
> won't share all of its token ranges. Right ?

Right.


Re: How to start using incremental repairs?

2016-09-12 Thread Alexander DEJANOVSKI
Hi Paulo,

don't you think it might be better to keep applying the migration procedure
whatever the version?
Anticompaction is pretty expensive on big SSTables, and if the cluster has a
lot of data, the first run might take a very long time if the nodes are dense,
especially with a high number of vnodes.
We've seen this on clusters that had just upgraded from 2.1 to 3.0, where
the first incremental repair took a ridiculous amount of time because
loads of anticompactions were running.

Indeed, if you run an inc repair on all ranges on a node, it can skip
anticompaction by just marking SSTables as repaired (which is fast),
but the rest of the nodes will still have to perform anticompaction as they
won't share all of its token ranges. Right?

Cheers,

Alex


Re: How to start using incremental repairs?

2016-09-12 Thread Paulo Motta
> Can you clarify me please if what you said here applies for the version
> 2.1.14.

Yes.


Re: How to start using incremental repairs?

2016-09-06 Thread Jean Carlo
Hi Paulo

Can you please clarify whether what you said here

1. Migration procedure is no longer necessary after CASSANDRA-8004, and
since you never ran repair before this would not make any difference
anyway, so just run repair and by default (CASSANDRA-7250) this will
already be incremental.

applies to version 2.1.14? I ask because I see that the JIRA ticket
CASSANDRA-8004 is resolved for version 2.1.2, and we are considering
migrating to incremental repairs before going to version 3.0.x.

Thanks :)


Regards

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
An extract of this conversation should definitely be posted somewhere.
Read a lot but never learnt all these bits...


Re: How to start using incremental repairs?

2016-08-26 Thread Paulo Motta
> I must admit that I fail to understand currently how running repair with
> -pr could leave unrepaired data though, even when ran on all nodes in all
> DCs, and how that could be specific to incremental repair (and would
> appreciate if someone shared the explanation).

Anti-compaction, which marks SSTables as repaired, is disabled for partial
range repairs (which include partitioner-range repair) to avoid the extra
I/O cost of running anti-compaction multiple times on a node to repair it
completely. For example, there is an optimization that skips
anti-compaction for SSTables fully contained in the repaired range (only
the repairedAt field is mutated). Full range repair leverages it, but it
would not apply in many cases for partial range repairs, yielding
higher I/O.
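
To make the optimization described above concrete, here is a minimal Python sketch. It is not Cassandra's actual code: the function names, the three outcome labels, and the simplified non-wrapping `(left, right]` token ranges are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical, simplified token range: (left, right], left < right, no wrap-around.
Range = Tuple[int, int]


def contains(r: Range, first: int, last: int) -> bool:
    """True if the SSTable's token span [first, last] lies entirely inside r."""
    left, right = r
    return left < first and last <= right


def intersects(r: Range, first: int, last: int) -> bool:
    """True if the SSTable's token span [first, last] overlaps r at all."""
    left, right = r
    return first <= right and last > left


def classify(sstable: Tuple[int, int], repaired: List[Range]) -> str:
    """Decide what would happen to one SSTable after a repair session.

    Labels are illustrative only:
      "mark_repaired" -> cheap path, only repairedAt is mutated
      "anticompact"   -> SSTable must be rewritten and split
      "untouched"     -> SSTable holds no data in the repaired ranges
    """
    first, last = sstable
    if any(contains(r, first, last) for r in repaired):
        return "mark_repaired"
    if any(intersects(r, first, last) for r in repaired):
        return "anticompact"
    return "untouched"


if __name__ == "__main__":
    # Repairing the whole (toy) range: the SSTable is fully contained.
    print(classify((10, 20), [(0, 100)]))
    # Repairing only part of it: the SSTable straddles the range boundary.
    print(classify((10, 20), [(0, 15)]))
```

The smaller the repaired range, the more SSTables fall into the "anticompact" case, which is the extra I/O mentioned above.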


Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
I see. Didn't think about it that way. Thanks for clarifying!


Re: How to start using incremental repairs?

2016-08-26 Thread Paulo Motta
> What is the underlying reason?

Basically, to minimize the amount of anti-compaction needed: with RF=3 and
-pr you'd need to perform anti-compaction 3 times on a particular node to
get it fully repaired, while without -pr you can just repair the node's full
range in one run. Assuming you run repair frequently enough, this will not be
a big deal, since already repaired data is skipped in the next round, so
you will not have the problem of redoing work as in non-incremental, non-pr
repair.
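
The "3 times with RF=3" arithmetic can be checked with a toy model. The sketch below assumes a simplified ring (one token per node, replicas placed on the next RF-1 nodes, no vnodes or rack awareness); the function names are hypothetical and it does not call any Cassandra API.

```python
from typing import List


def replicas(primary_owner: int, n_nodes: int, rf: int) -> List[int]:
    """Nodes holding the primary range of `primary_owner`: the owner plus the next rf-1 nodes."""
    return [(primary_owner + k) % n_nodes for k in range(rf)]


def pr_sessions_touching(node: int, n_nodes: int, rf: int) -> List[int]:
    """Owners whose -pr repair session involves `node` as a replica.

    Each such session covers only part of the data on `node`, so marking
    that data repaired would require a separate anti-compaction pass.
    """
    return [p for p in range(n_nodes) if node in replicas(p, n_nodes, rf)]


if __name__ == "__main__":
    n_nodes, rf = 6, 3
    sessions = pr_sessions_touching(0, n_nodes, rf)
    print(f"node 0 takes part in {len(sessions)} -pr sessions, owned by nodes {sessions}")
```

In this model every node takes part in exactly RF partial sessions, whereas a repair of the node's full range covers all of its data in a single session.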


Re: How to start using incremental repairs?

2016-08-26 Thread Alexander DEJANOVSKI
After running some tests I can confirm that using -pr leaves unrepaired
SSTables, while removing it shows repaired SSTables only once repair is
completed.
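
A check like this can be scripted. The following Python sketch only parses text; it assumes the per-SSTable metadata dump (for example from the `sstablemetadata` tool) contains a `Repaired at: <number>` line, as in the `Repaired at: 0` quoted in the original question. The function names are hypothetical.

```python
import re
from typing import Optional

# Matches a line such as "Repaired at: 0"; the surrounding output format is an assumption.
_REPAIRED_AT = re.compile(r"^\s*Repaired at:\s*(\d+)\s*$", re.MULTILINE)


def repaired_at(metadata_text: str) -> Optional[int]:
    """Return the repairedAt value found in the metadata text, or None if absent."""
    match = _REPAIRED_AT.search(metadata_text)
    return int(match.group(1)) if match else None


def is_repaired(metadata_text: str) -> bool:
    """An SSTable counts as repaired when repairedAt is present and non-zero."""
    value = repaired_at(metadata_text)
    return value is not None and value > 0


if __name__ == "__main__":
    sample = "Repaired at: 0\n"  # illustrative input, not real tool output
    print(is_repaired(sample))
```

Running it over the metadata of every SSTable in a keyspace before and after a repair would show whether the run actually marked anything as repaired.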

The purpose of -pr was to lighten the repair process by not repairing
ranges RF times, but just once. With incremental repair though, repaired
data is marked as such and will be skipped on the next session, making -pr
kinda useless.

I must admit that I currently fail to understand how running repair with
-pr could leave unrepaired data, even when run on all nodes in all
DCs, and how that could be specific to incremental repair (and I would
appreciate it if someone shared the explanation).

On a side note, I have a Spotify Reaper fork that handles incremental
repair, and embeds the UI of Stefan Podkowinski, tweaked to add incremental
repair inputs :
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui

Compile it with Maven and run it with:

    java -jar target/cassandra-reaper-0.2.4-SNAPSHOT.jar server resource/cassandra-reaper.yaml

Then go to http://127.0.0.1:8081/webui/



On Fri, Aug 26, 2016 at 12:58, Stefano Ortolani wrote:

> Hi Paulo, could you elaborate on 2?
> I didn't know incremental repairs were not compatible with -pr
> What is the underlying reason?
>
> Regards,
> Stefano
>
>
> On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta 
> wrote:
>
>> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
>> since you never ran repair before this would not make any difference
>> anyway, so just run repair and by default (CASSANDRA-7250) this will
>> already be incremental.
>> 2. Incremental repair is not supported with -pr, -local or -st/-et
>> options, so you should run incremental repair in all nodes in all DCs
>> sequentially (you should be aware that this will probably generate inter-DC
>> traffic), no need to disable autocompaction or stopping nodes.
>>
>> 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :
>>
>>> I’m new in Cassandra and trying to figure out how to _start_ using
>>> incremental repairs. I have seen article about “Migrating to incremental
>>> repairs” but since I didn’t use repairs before at all and I use Cassandra
>>> version v3.0.8, then maybe not all steps are needed which are mentioned in
>>> Datastax article.
>>> Should I start with full repair or I can start with executing “nodetool
>>> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
>>> node stopping?
>>>
>>> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>>>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
>>> on all nodes in _all_ DCs?
>>>
>>> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes
>>> in all datacenters sequentially but I still can see non repaired SSTables
>>> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
>>> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>>>
>>
>>
>


Re: How to start using incremental repairs?

2016-08-26 Thread Stefano Ortolani
Hi Paulo, could you elaborate on 2?
I didn't know incremental repairs were not compatible with -pr
What is the underlying reason?

Regards,
Stefano

On Fri, Aug 26, 2016 at 1:25 AM, Paulo Motta 
wrote:

> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
> since you never ran repair before this would not make any difference
> anyway, so just run repair and by default (CASSANDRA-7250) this will
> already be incremental.
> 2. Incremental repair is not supported with -pr, -local or -st/-et
> options, so you should run incremental repair in all nodes in all DCs
> sequentially (you should be aware that this will probably generate inter-DC
> traffic), no need to disable autocompaction or stopping nodes.
>
> 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :
>
>> I’m new in Cassandra and trying to figure out how to _start_ using
>> incremental repairs. I have seen article about “Migrating to incremental
>> repairs” but since I didn’t use repairs before at all and I use Cassandra
>> version v3.0.8, then maybe not all steps are needed which are mentioned in
>> Datastax article.
>> Should I start with full repair or I can start with executing “nodetool
>> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
>> node stopping?
>>
>> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
>> on all nodes in _all_ DCs?
>>
>> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes
>> in all datacenters sequentially but I still can see non repaired SSTables
>> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
>> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>>
>
>


Re: How to start using incremental repairs?

2016-08-26 Thread Aleksandr Ivanov
Thanks Alexander.
1. No reads or writes were enabled on the keyspace during repair.
2. All repairs were started sequentially on all nodes, one by one (each new
repair started after the repair on the previous node had completed).

I'll dig deeper into the logs; maybe there are some records explaining
why some sstables remain unrepaired. But at first glance all repairs
finished without any problems, according to the log.

On Fri, 26 Aug 2016 at 9:18, Alexander DEJANOVSKI wrote:

> There are 2 main reasons I see for still having unrepaired sstables after
> running nodetool repair -pr :
>
> 1- new data is still flowing in your database after the repair sessions
> were launched, and thus hasn't been repaired
> 2- some repair sessions failed and left unrepaired data on your nodes.
> Incremental repair isn't fond of concurrency, as an SSTable cannot be
> anticompacted and go through validation compaction at the same time. So if
> an SSTable is being anticompacted and another node asks for a merkle tree
> that involves it, it will fail with a message in the system.log saying that
> an sstable cannot be involved in more than one repair session at a time
> (search for validation failures in your cassandra log).
> Best chance to have it succeed IMHO is to run inc repair one node at a
> time.
>
> On Fri, 26 Aug 2016 at 08:02, Aleksandr Ivanov wrote:
>
>> Thanks for the confirmation, Paulo. So my understanding of the process was
>> correct.
>>
>> I'm curious why I still see unrepaired sstables after performing repair
>> -pr on all nodes in all datacenters...
>>
>> On Fri, 26 Aug 2016 at 3:25, Paulo Motta wrote:
>>
>>> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
>>> since you never ran repair before this would not make any difference
>>> anyway, so just run repair and by default (CASSANDRA-7250) this will
>>> already be incremental.
>>> 2. Incremental repair is not supported with -pr, -local or -st/-et
>>> options, so you should run incremental repair in all nodes in all DCs
>>> sequentially (you should be aware that this will probably generate inter-DC
>>> traffic), no need to disable autocompaction or stopping nodes.
>>>
>>> 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :
>>>
>>>> I’m new in Cassandra and trying to figure out how to _start_ using
>>>> incremental repairs. I have seen article about “Migrating to incremental
>>>> repairs” but since I didn’t use repairs before at all and I use Cassandra
>>>> version v3.0.8, then maybe not all steps are needed which are mentioned in
>>>> Datastax article.
>>>> Should I start with full repair or I can start with executing “nodetool
>>>> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
>>>> node stopping?
>>>>
>>>> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>>>>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
>>>> on all nodes in _all_ DCs?
>>>>
>>>> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes
>>>> in all datacenters sequentially but I still can see non repaired SSTables
>>>> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
>>>> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>>>>
>>>
>>>


Re: How to start using incremental repairs?

2016-08-26 Thread Alexander DEJANOVSKI
There are 2 main reasons I see for still having unrepaired sstables after
running nodetool repair -pr :

1- new data is still flowing in your database after the repair sessions
were launched, and thus hasn't been repaired
2- some repair sessions failed and left unrepaired data on your nodes.
Incremental repair isn't fond of concurrency, as an SSTable cannot be
anticompacted and go through validation compaction at the same time. So if
an SSTable is being anticompacted and another node asks for a merkle tree
that involves it, it will fail with a message in the system.log saying that
an sstable cannot be involved in more than one repair session at a time
(search for validation failures in your cassandra log).
Best chance to have it succeed IMHO is to run inc repair one node at a time.
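
To check which SSTables are still unrepaired after a run, something like the sketch below can help. It assumes you have concatenated the text output of sstablemetadata for several SSTables, and that each dump contains an "SSTable: <path>" line followed by a "Repaired at: <timestamp>" line; the exact layout may differ between versions, and the paths in the sample are made up.

```python
import re


def repaired_status(metadata_text):
    """Map each SSTable path to its 'Repaired at' value (0 = unrepaired).

    Assumes each metadata dump has an 'SSTable: <path>' line before its
    'Repaired at: <n>' line; adjust the patterns if your output differs.
    """
    status = {}
    current = None
    for raw in metadata_text.splitlines():
        line = raw.strip()
        match = re.match(r"SSTable:\s*(\S+)", line)
        if match:
            current = match.group(1)
            continue
        match = re.match(r"Repaired at:\s*(\d+)", line)
        if match and current is not None:
            status[current] = int(match.group(1))
    return status


def unrepaired(status):
    """Return the paths whose 'Repaired at' value is 0."""
    return sorted(path for path, ts in status.items() if ts == 0)


# Hypothetical sample; real output contains many more fields per SSTable.
SAMPLE = """\
SSTable: /var/lib/cassandra/data/my_keyspace/t1/mc-1-big
Repaired at: 0
SSTable: /var/lib/cassandra/data/my_keyspace/t1/mc-2-big
Repaired at: 1472200000000
"""

for path in unrepaired(repaired_status(SAMPLE)):
    print("unrepaired:", path)
```

Any path reported here still has "Repaired at: 0", so it was either written after the repair session started or belongs to a session that failed.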

On Fri, 26 Aug 2016 at 08:02, Aleksandr Ivanov wrote:

> Thanks for the confirmation, Paulo. So my understanding of the process was
> correct.
>
> I'm curious why I still see unrepaired sstables after performing repair
> -pr on all nodes in all datacenters...
>
> On Fri, 26 Aug 2016 at 3:25, Paulo Motta wrote:
>
>> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
>> since you never ran repair before this would not make any difference
>> anyway, so just run repair and by default (CASSANDRA-7250) this will
>> already be incremental.
>> 2. Incremental repair is not supported with -pr, -local or -st/-et
>> options, so you should run incremental repair in all nodes in all DCs
>> sequentially (you should be aware that this will probably generate inter-DC
>> traffic), no need to disable autocompaction or stopping nodes.
>>
>> 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :
>>
>>> I’m new in Cassandra and trying to figure out how to _start_ using
>>> incremental repairs. I have seen article about “Migrating to incremental
>>> repairs” but since I didn’t use repairs before at all and I use Cassandra
>>> version v3.0.8, then maybe not all steps are needed which are mentioned in
>>> Datastax article.
>>> Should I start with full repair or I can start with executing “nodetool
>>> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
>>> node stopping?
>>>
>>> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>>>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
>>> on all nodes in _all_ DCs?
>>>
>>> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes
>>> in all datacenters sequentially but I still can see non repaired SSTables
>>> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
>>> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>>>
>>
>>


Re: How to start using incremental repairs?

2016-08-26 Thread Aleksandr Ivanov
Thanks for the confirmation, Paulo. So my understanding of the process was
correct.

I'm curious why I still see unrepaired sstables after performing repair -pr
on all nodes in all datacenters...

On Fri, 26 Aug 2016 at 3:25, Paulo Motta wrote:

> 1. Migration procedure is no longer necessary after CASSANDRA-8004, and
> since you never ran repair before this would not make any difference
> anyway, so just run repair and by default (CASSANDRA-7250) this will
> already be incremental.
> 2. Incremental repair is not supported with -pr, -local or -st/-et
> options, so you should run incremental repair in all nodes in all DCs
> sequentially (you should be aware that this will probably generate inter-DC
> traffic), no need to disable autocompaction or stopping nodes.
>
> 2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :
>
>> I’m new in Cassandra and trying to figure out how to _start_ using
>> incremental repairs. I have seen article about “Migrating to incremental
>> repairs” but since I didn’t use repairs before at all and I use Cassandra
>> version v3.0.8, then maybe not all steps are needed which are mentioned in
>> Datastax article.
>> Should I start with full repair or I can start with executing “nodetool
>> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
>> node stopping?
>>
>> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
>> on all nodes in _all_ DCs?
>>
>> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes
>> in all datacenters sequentially but I still can see non repaired SSTables
>> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
>> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>>
>
>


Re: How to start using incremental repairs?

2016-08-25 Thread Paulo Motta
1. Migration procedure is no longer necessary after CASSANDRA-8004, and
since you never ran repair before this would not make any difference
anyway, so just run repair and by default (CASSANDRA-7250) this will
already be incremental.
2. Incremental repair is not supported with the -pr, -local or -st/-et
options, so you should run incremental repair on all nodes in all DCs
sequentially (be aware that this will probably generate inter-DC traffic).
There is no need to disable autocompaction or stop nodes.
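
A rough sketch of "all nodes in all DCs, sequentially" is below. The host names and topology are hypothetical, and it assumes nodetool is on the PATH, can reach each node with -h, and returns a non-zero exit status on failure (worth checking against the logs as well). It relies on repair being incremental by default, as noted in point 1.

```python
import subprocess


def repair_commands(topology, keyspace):
    """Build one 'nodetool repair' command per node, DC by DC.

    No -pr, -local or -st/-et options are passed, so each node repairs
    every range it replicates.
    """
    commands = []
    for dc in sorted(topology):
        for host in topology[dc]:
            commands.append(["nodetool", "-h", host, "repair", keyspace])
    return commands


def run_sequentially(commands, runner=subprocess.run):
    """Run the commands one at a time; stop at the first failure."""
    completed = 0
    for cmd in commands:
        result = runner(cmd)  # blocks until this node's repair returns
        if result.returncode != 0:
            raise RuntimeError("repair failed: " + " ".join(cmd))
        completed += 1
    return completed


# Hypothetical topology; replace with your own DCs and hosts.
TOPOLOGY = {
    "dc1": ["dc1-node1", "dc1-node2"],
    "dc2": ["dc2-node1", "dc2-node2"],
}

for cmd in repair_commands(TOPOLOGY, "my_keyspace"):
    print(" ".join(cmd))
# To actually execute: run_sequentially(repair_commands(TOPOLOGY, "my_keyspace"))
```

Stopping at the first failure is a deliberate choice here: a failed session leaves unrepaired data behind, so it is better to investigate before moving on to the next node.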

2016-08-25 18:27 GMT-03:00 Aleksandr Ivanov :

> I’m new in Cassandra and trying to figure out how to _start_ using
> incremental repairs. I have seen article about “Migrating to incremental
> repairs” but since I didn’t use repairs before at all and I use Cassandra
> version v3.0.8, then maybe not all steps are needed which are mentioned in
> Datastax article.
> Should I start with full repair or I can start with executing “nodetool
> repair -pr  my_keyspace” on all nodes without autocompaction disabling and
> node stopping?
>
> I have 6 datacenters with 6 nodes in each DC. Is it enough to run
>  “nodetool repair -pr  my_keyspace” in one DC only or it should be executed
> on all nodes in _all_ DCs?
>
> I have tried to perform “nodetool repair -pr  my_keyspace” on all nodes in
> all datacenters sequentially but I still can see non repaired SSTables
> for my_keyspace   (Repaired at: 0). Is it expected behavior if during
> repair data in my_keyspace wasn’t modified (no writes, no reads)?
>