Re: [EXTERNAL] Re: Increasing VNodes

2017-10-04 Thread Chris Lohfink
Can't you just increase the segmentCount option to split it more?

On Wed, Oct 4, 2017 at 12:50 PM, Mohapatra, Kishore <
kishore.mohapa...@nuance.com> wrote:

> Thanks a lot for all of your input. We are actually using Cassandra
> Reaper, but it is just splitting the ranges into 256 per node.
>
> But I will certainly try out splitting into smaller ranges by going through
> the system.size_estimates table.


RE: [EXTERNAL] Re: Increasing VNodes

2017-10-04 Thread Mohapatra, Kishore
Thanks a lot for all of your input. We are actually using Cassandra Reaper, but
it is just splitting the ranges into 256 per node.
But I will certainly try out splitting into smaller ranges by going through the
system.size_estimates table.
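
A rough sketch of that idea, assuming the rows have already been fetched from system.size_estimates (columns range_start, range_end, partitions_count) for a single table. The function name, the target_partitions parameter, and the sample rows are hypothetical, and wrapping ranges are passed through unsplit to keep the example short:

```python
import math


def plan_subranges(estimates, target_partitions):
    """Turn size estimates into subranges of roughly target_partitions each.

    estimates: iterable of (range_start, range_end, partitions_count), with the
    tokens given as strings or ints (size_estimates stores them as text).
    """
    if target_partitions < 1:
        raise ValueError("target_partitions must be at least 1")
    plan = []
    for start, end, partitions in estimates:
        start, end = int(start), int(end)
        if start >= end:
            # Wrapping range: left unsplit in this sketch.
            plan.append((start, end))
            continue
        # Empty or small ranges stay whole; large ranges get more pieces.
        pieces = max(1, math.ceil(partitions / target_partitions))
        pieces = min(pieces, end - start)
        bounds = [start + (end - start) * i // pieces for i in range(pieces + 1)]
        plan.extend(zip(bounds, bounds[1:]))
    return plan


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [("0", "100", 0), ("100", "200", 250)]
    for st, et in plan_subranges(rows, target_partitions=100):
        print(st, et)
```

Each (st, et) pair can then be handed to a subrange repair. Choosing the number of pieces per range from its estimated partition count is what keeps empty ranges from being split needlessly.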

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com



Re: Increasing VNodes

2017-10-04 Thread Jon Haddad
The site (with the docs) is probably more helpful for learning how Reaper
works: http://cassandra-reaper.io/


Re: Increasing VNodes

2017-10-04 Thread Chris Lohfink
Increasing the number of tokens will make repairs worse, not better. You can
just split the subranges into smaller chunks; you don't need to use vnodes
to do that. A simple approach is to iterate through each host's token ranges,
split each by N, and repair them (e.g.
https://github.com/onzra/cassandra_range_repair). To be more efficient, you
can grab the ranges and split them based on the number of partitions in each
range (i.e. fetch system.size_estimates and walk that), so you don't split
empty or small ranges a ton unnecessarily, and because no single fixed N is
efficient for all tables.
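
A minimal sketch of the simple approach (split a range by N and repair each piece), assuming Murmur3 tokens (signed 64-bit integers) and a non-wrapping range. The function names, keyspace name, and example tokens are illustrative only; -st and -et are the start and end token options of nodetool repair. The code only builds command strings and does not run anything:

```python
def split_range(start, end, n):
    """Split the token range (start, end] into up to n contiguous subranges."""
    if start >= end:
        raise ValueError("sketch handles only non-wrapping ranges (start < end)")
    if n < 1:
        raise ValueError("n must be at least 1")
    n = min(n, end - start)  # never create empty subranges
    bounds = [start + (end - start) * i // n for i in range(n + 1)]
    return list(zip(bounds, bounds[1:]))


def repair_commands(keyspace, start, end, n):
    """Build one subrange repair command per chunk (strings only)."""
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_range(start, end, n)
    ]


if __name__ == "__main__":
    # Hypothetical keyspace and token range, for illustration only.
    for cmd in repair_commands("my_keyspace", -9223372036854775808,
                               -9187343239835811840, 4):
        print(cmd)
```

In practice this would be repeated for every token range a host owns, which is roughly what the linked script automates.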

Using TLP's Reaper (https://github.com/thelastpickle/cassandra-reaper) or
DataStax OpsCenter's repair service is the easiest solution without a lot of
effort. Repairs are hard.

Chris



Re: Increasing VNodes

2017-10-04 Thread Jeff Jirsa
You don't need to change the number of vnodes; you can manually select
CONTAINED token subranges and pass them in with -st and -et (just try to pick
a number > 2^20 that is fully contained by at least one vnode).
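
A small sketch of the containment check, assuming the vnode ranges are already known as (start, end] token pairs, however you obtain them. The function names and the sample ranges are hypothetical. A vnode range with start >= end is treated as wrapping around the ring, and only non-wrapping subranges are accepted:

```python
def is_contained(vnode, st, et):
    """Return True if the subrange (st, et] lies entirely inside one vnode range."""
    start, end = vnode
    if st >= et:
        return False  # this sketch only accepts non-wrapping subranges
    if start < end:
        return start <= st and et <= end
    # Wrapping vnode range: covers tokens above start and tokens up to end.
    return st >= start or et <= end


def find_containing_vnode(vnode_ranges, st, et):
    """Return the first vnode range that fully contains (st, et], or None."""
    for vnode in vnode_ranges:
        if is_contained(vnode, st, et):
            return vnode
    return None


def repair_command(keyspace, vnode_ranges, st, et):
    """Build a subrange repair command only if the subrange is contained."""
    if find_containing_vnode(vnode_ranges, st, et) is None:
        raise ValueError("subrange crosses a vnode boundary")
    return f"nodetool repair -st {st} -et {et} {keyspace}"


if __name__ == "__main__":
    # Made-up vnode ranges for illustration only.
    vnodes = [(-100, 0), (0, 500), (500, -100)]
    print(repair_command("my_keyspace", vnodes, 10, 20))
```

The check runs before the command is built so that a subrange spanning two vnodes is rejected up front.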






Increasing VNodes

2017-10-04 Thread Mohapatra, Kishore
Hi,
We are having a lot of problems with the repair process. We use subrange
repair, but most of the time some ranges fail with a streaming error or some
other kind of error.
So we are wondering whether it would help to increase the number of vnodes
from 256 (the default) to 512. But increasing the vnodes would take a lot of
effort, as it involves wiping out the data and bootstrapping.
So is there any other way of splitting the range into smaller ranges?

We are using version 2.1.15.4 at the moment.

Thanks

Kishore Mohapatra
Principal Operations DBA
Seattle, WA
Email : kishore.mohapa...@nuance.com