Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Kyrylo Lebediev
oh.. I'm sorry, that's my fault. Thank you for the correction, Jon. While it's 
possible in theory, I didn't realize that the remaining 256-token nodes will 
take on relatively more and more data with each successive decommission of a 
256-token node.

Regards,
Kyrill


From: Jon Haddad
Sent: Saturday, February 24, 2018 5:44:24 PM
To: user@cassandra.apache.org
Subject: Re: Is it possible / does it make sense to limit concurrent streaming 
during bootstrapping new nodes?

You can’t migrate down that way.  The last several nodes you have up will get 
completely overwhelmed, and you’ll be completely screwed.  Please do not give 
advice like this unless you’ve actually gone through the process or at least 
have an understanding of how the data will be shifted.  Adding nodes with 16 
tokens while decommissioning the ones with 256 will be absolute hell.

You can only do this by adding a new DC and retiring the old.


Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Jon Haddad
We don’t have this documented *anywhere* right now. I’ve created a JIRA to 
update the site with the relevant info on this topic: 
https://issues.apache.org/jira/browse/CASSANDRA-14258





Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Jon Haddad
You can’t migrate down that way.  The last several nodes you have up will get 
completely overwhelmed, and you’ll be completely screwed.  Please do not give 
advice like this unless you’ve actually gone through the process or at least 
have an understanding of how the data will be shifted.  Adding nodes with 16 
tokens while decommissioning the ones with 256 will be absolute hell.

You can only do this by adding a new DC and retiring the old.

> On Feb 24, 2018, at 2:26 AM, Kyrylo Lebediev wrote:
> 
> > By the way, is it possible to migrate towards smaller token ranges? What 
> > is the recommended way of doing so?
> I didn't see this question answered. I think the easiest way to do this is 
> to add new C* nodes with fewer vnodes (8 or 16 instead of the default 256) 
> and then decommission the old nodes with vnodes=256.

Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Kyrylo Lebediev
> By the way, is it possible to migrate towards smaller token ranges? What 
> is the recommended way of doing so?

 - I didn't see this question answered. I think the easiest way to do this is 
to add new C* nodes with fewer vnodes (8 or 16 instead of the default 256) and 
then decommission the old nodes with vnodes=256.


Thanks, guys, for shedding some light on this Java multithreading-related 
scalability issue. BTW, how can one tell from JVM / OS metrics that the number 
of threads is becoming a bottleneck for a JVM?
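
One place to start looking (a sketch using only the standard 
java.lang.management API; the class name here is made up, and the same 
counters are exposed remotely over JMX, which is also where nodetool gets its 
numbers): a steadily climbing live-thread count during bootstrap, or a large 
share of threads stuck in BLOCKED state, is usually the first sign that 
threads themselves are the bottleneck.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadPressure {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            // Raw counts: compare live vs. peak over time during a bootstrap.
            System.out.println("live threads: " + threads.getThreadCount());
            System.out.println("peak threads: " + threads.getPeakThreadCount());

            // Thread states: many BLOCKED threads means lock contention,
            // not useful concurrency.
            long blocked = 0;
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                    blocked++;
                }
            }
            System.out.println("blocked threads: " + blocked);
        }
    }

On the OS side, the "Threads:" line in /proc/<pid>/status gives the same count 
without touching the JVM.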

Also, I'd like to add a comment: the higher the number of vnodes per node, the 
lower the overall reliability of the cluster. Replicas for a token range are 
placed on the nodes responsible for the next+1, next+2 ranges (not taking into 
account NetworkTopologyStrategy / Snitch, which help, but seemingly not that 
much when expressed in terms of probabilities). The higher the number of 
vnodes per node, the higher the probability that all nodes in the cluster 
become 'neighbors' in terms of token ranges.

There is no trivial formula for the 'reliability' of a C* cluster [I haven't 
had a chance to do the math], but in general, with a larger number of nodes in 
a cluster (like 100 or 200), the probability that 2 or more nodes are down at 
the same time grows quickly with the number of nodes.
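
A back-of-the-envelope version of that math, assuming each node is down 
independently with a small probability p:

    P(\ge 2 \text{ of } N \text{ down})
      = 1 - (1-p)^N - N p (1-p)^{N-1}
      \approx \binom{N}{2} p^2

so the chance of a double failure grows roughly quadratically in N. With 256 
vnodes per node, almost every pair of nodes shares at least one token range, 
so nearly all of the \binom{N}{2} pairs can take two replicas of some range 
away at once; with one token per node, only the O(N) pairs that are actual 
replica neighbors can, which is the reliability advantage described here.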


The most reliable C* setup is to use initial_token instead of vnodes.

But this makes the cluster harder to manage [less automated, plus there will 
be hotspots in the cluster in some cases].


Remark: for a C* cluster with RF=3, for any number of nodes and any 
initial_token/vnodes setup, there is always the possibility that the 
simultaneous unavailability of 2 (or 3, depending on which CL is used) nodes 
will make a token range unavailable ('HostUnavailable' exception).

No miracles: reliability is mostly determined by the RF.


The task that must be solved for large clusters: "The reliability of a cluster 
with NNN nodes and RF=3 shouldn't be 'tangibly' lower than the reliability of 
a 3-node cluster with RF=3."


Kind Regards,

Kyrill



Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
We archive data in order to make assumptions on it in the future, so yes, we 
expect to grow continuously. In the meantime I have learned to go for 
predictable growth per partition rather than unpredictably large partitions. 
So today we are writing 250,000,000 records per day into a single table and 
heading towards about 100 times that number this year. A partition will grow 
by one record a day, which should give us good horizontal scalability, but it 
means 250,000,000 to 25,000,000,000 partitions. I hope these numbers should 
not make me feel uncomfortable :)

Sent from my iPhone



Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jeff Jirsa
At a past job, we set the limit at around 60 hosts per cluster - anything
bigger than that got single token. Anything smaller, and we'd just tolerate
the inconveniences of vnodes. But that was before the new vnode token
allocation went into 3.0, and really assumed things that may not be true
for you (it was a cluster that started at 60 hosts and grew up to 480 in
steps, so we'd want to grow quickly - having single token allowed us to
grow from 60-120 in 2 days, and then 120-180 in 2 days, and so on).

Are you always going to be growing, or is it a short/temporary thing?
There are users of vnodes (at big, public companies) that go up into the
hundreds of nodes.

Most people running cassandra start sharding clusters rather than going
past a thousand or so nodes - I know there's at least one person I talked
to in IRC with a 1700 host cluster, but that'd be beyond what I'd ever do
personally.





Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
Thanks Jeff,
your answer is really not what I expected to learn, as it means more manual 
work as soon as we start really using C*. But I'm happy to be able to learn 
this now, while there is still time to acquire the necessary skills and to 
ask the right questions about how to correctly drive big data with C* before 
we actually start using it, and I'm glad to have people like you around who 
care about these questions. Thanks. This still convinces me that we have bet 
on the right horse, even if it might become a rough ride.

By the way, is it possible to migrate towards smaller token ranges? What is 
the recommended way of doing so? And which number of nodes is the typical 
'break even'?

Sent from my iPhone



Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jeff Jirsa
The scenario you describe is the typical point where people move away from
vnodes and towards single-token-per-node (or a much smaller number of
vnodes).

The default setting puts you in a situation where virtually all hosts are
adjacent/neighbors to all others (at least until you're way into the
hundreds of hosts), which means you'll stream from nearly all hosts. If you
drop the number of vnodes from ~256 to ~4 or ~8 or ~16, you'll see the
number of streams drop as well.

Many people with "large" clusters statically allocate tokens to make it
predictable - if you have a single token per host, you can add multiple
hosts at a time, each streaming from a small number of neighbors, without
overlap.

It takes a bit more tooling (or manual token calculation) outside of
cassandra, but works well in practice for "large" clusters.
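
As a concrete sketch of that manual token calculation (illustrative only; the 
class and variable names are made up here): with Murmur3Partitioner the token 
space runs from -2^63 to 2^63 - 1, and evenly spaced initial_token values can 
be computed like this:

    import java.math.BigInteger;

    public class InitialTokens {
        public static void main(String[] args) {
            int nodes = 15; // hypothetical cluster size
            BigInteger rangeSize = BigInteger.valueOf(2).pow(64);          // 2^64 tokens
            BigInteger minToken  = BigInteger.valueOf(2).pow(63).negate(); // -2^63

            for (int i = 0; i < nodes; i++) {
                // token_i = -2^63 + i * 2^64 / nodes, i.e. evenly spaced
                BigInteger token = minToken.add(
                        rangeSize.multiply(BigInteger.valueOf(i))
                                 .divide(BigInteger.valueOf(nodes)));
                System.out.println("node " + i + ": initial_token: " + token);
            }
        }
    }

Each computed value goes into that node's cassandra.yaml as initial_token. 
When doubling such a cluster (as in Jeff's 60-to-120 example above), each new 
node gets the midpoint between two existing tokens, so it streams from the 
small, fixed set of replicas for that one range.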






Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Nicolas Guyomar
Yes, you are right: it limits how much data a node will send in total to other 
nodes while streaming data (repair, bootstrap, etc.), so that it does not 
affect that node's performance.

Bootstrapping is initiated by the bootstrapping node itself, which determines, 
based on its tokens, which nodes to ask for data; it then computes its 
"streaming plan" and initializes every session at the same time (look 
at org.apache.cassandra.streaming.StreamResultFuture#init).

But I can see in the code that connections for sessions are established 
sequentially in a fixed-size executor pool limited by the node's number of 
processors, so, if I understand the code correctly, this might be the limit 
you are looking for.

Because of that limit you should only have a bounded number of ongoing 
streaming sessions at any one time, but I have to admit that, with all the 
async methods/callbacks in the code, I might be wrong :(
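
To illustrate the shape of that bottleneck (an illustrative sketch only, not 
Cassandra's actual code; the peer addresses are made up): a fixed-size pool 
sized to the processor count caps how many session connections can be in 
flight at once, while the remaining peers simply queue.

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class StreamPlanSketch {
        public static void main(String[] args) {
            // One task per peer to stream from.
            List<String> peers =
                    List.of("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4");

            // At most availableProcessors() connection attempts run at once;
            // the rest wait in the executor's queue.
            ExecutorService connectExecutor = Executors.newFixedThreadPool(
                    Runtime.getRuntime().availableProcessors());

            for (String peer : peers) {
                connectExecutor.submit(() -> {
                    // connect + session init with 'peer' would happen here
                    System.out.println("initiating stream session with " + peer);
                });
            }
            connectExecutor.shutdown();
        }
    }

So even with hundreds of peers, the number of simultaneous connection attempts 
stays bounded by the core count; whether the sessions themselves stay bounded 
after the connect phase is a separate question.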




Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
Hi Nicolas,
I have seen 'stream_throughput_outbound_megabits_per_sec', but afaik this 
limits what each node will provide at a maximum. What I'm more concerned 
about is the vast number of connections to handle, and the concurrent 
threads, of which at least two get started for every single streaming 
connection. I'm a former Java developer and I know that threads are expensive 
even when they are just sitting around. The JVM does fine handling some 
hundreds of threads, but what about some thousands?

And about the cleanup: we are currently massively scaling out a startup 
database. I thought of doing the cleanup after that ramp-up phase, when 
scaling slows down to maybe a node every 3 to 7 days on average. So for now 
I want to add 32 machines one after another and then take care of the 
cleanup afterwards, one by one.

regards,
Jürgen

>


Re: Is it possible / does it make sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Nicolas Guyomar
Hi Jürgen,

stream_throughput_outbound_megabits_per_sec is the "given total throughput in 
Mbps", so it does limit the "concurrent throughput" IMHO; is it not what you 
are looking for?

The only limits I can think of are:
- the number of connections between every node and the one bootstrapping
- the number of pending compactions (especially if you have lots of 
keyspaces/tables), which could lead to some JVM problem maybe?

Anyway, because a node is not accepting reads while bootstrapping, settings 
like the compaction throughput, the number of concurrent compactors and the 
streaming throughput can be changed on the fly using nodetool (e.g. nodetool 
setcompactionthroughput and nodetool setstreamthroughput), so you can quickly 
adjust them.

Out of curiosity, do you run "nodetool cleanup" in parallel on every node 
left after a bootstrap, or do you spread the "cleanup load"? I have not yet 
seen anyone adding a node every day like this ;) have fun!



On 20 February 2018 at 13:42, Jürgen Albersdorfer wrote:

> Hi, I'm wondering if it is possible, resp. would it make sense, to limit
> concurrent streaming when joining a new node to the cluster.
>
> I'm currently operating a 15-node C* cluster (v3.11.1) and joining
> another node every day.
> 'nodetool netstats' shows that it always streams data from all other nodes.
>
> How far will this scale? What happens when I have hundreds or even
> thousands of nodes?
>
> Does anyone have experience with such a situation?
>
> Thanks, and regards
> Jürgen