Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Kyrylo Lebediev
<jonathan.had...@gmail.com> on behalf of Jon Haddad <j...@jonhaddad.com> Sent: Saturday, February 24, 2018 5:44:24 PM To: user@cassandra.apache.org Subject: Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes? You can’t migrate down that wa

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Jon Haddad
ption). >> No miracles: reliability is mostly determined by RF number. >> >> Which task must be solved for large clusters: "Reliability of a cluster with >> NNN nodes and RF=3 shouldn't be 'tangibly' less than reliability of 3-nodes >> cluster with RF=3" >>

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Jon Haddad
with RF=3" > > Kind Regards, > Kyrill > From: Jürgen Albersdorfer <jalbersdor...@gmail.com> > Sent: Tuesday, February 20, 2018 10:34:21 PM > To: user@cassandra.apache.org > Subject: Re: Is it possible / makes it sense to limit concurrent streaming > during bootst

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-24 Thread Kyrylo Lebediev
...@gmail.com> Sent: Tuesday, February 20, 2018 10:34:21 PM To: user@cassandra.apache.org Subject: Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes? Thanks Jeff, your answer is really not what I expected to learn - which is again more manual doing as s

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
We archive data in order to make assumptions about it in the future. So yes, we expect to grow continuously. In the meantime I learned to go for predictable growth per partition rather than unpredictable large partitions. So today we are growing by 250,000,000 records per day going into a single

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jeff Jirsa
At a past job, we set the limit at around 60 hosts per cluster - anything bigger than that got single token. Anything smaller, and we'd just tolerate the inconveniences of vnodes. But that was before the new vnode token allocation went into 3.0, and really assumed things that may not be true for

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
Thanks Jeff, your answer is really not what I expected to learn - which is, again, more manual work as soon as we really start using C*. But I'm happy to be able to learn it now and still have time to learn the necessary skills and ask the right questions on how to correctly drive big data with

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jeff Jirsa
The scenario you describe is the typical point where people move away from vnodes and towards single-token-per-node (or a much smaller number of vnodes). The default setting puts you in a situation where virtually all hosts are adjacent/neighbors to all others (at least until you're way into the
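
To illustrate the adjacency point, here is a minimal Python sketch. It assumes purely random token assignment and counts as "neighbors" only the hosts that own the token directly before or after each of a host's tokens. It ignores replication factor, racks, and the newer token allocation mentioned elsewhere in the thread. The function name is hypothetical, and 256 tokens per host is used only as an example value.

```python
import random


def average_distinct_neighbors(num_hosts, tokens_per_host, seed=0):
    """Average number of distinct ring-adjacent hosts per host (toy model)."""
    rng = random.Random(seed)
    # Build the ring: every host gets `tokens_per_host` random 64-bit tokens.
    ring = sorted(
        (rng.getrandbits(64), host)
        for host in range(num_hosts)
        for _ in range(tokens_per_host)
    )
    neighbors = {host: set() for host in range(num_hosts)}
    n = len(ring)
    for i, (_, host) in enumerate(ring):
        # Look at the previous and next token on the ring (with wrap-around).
        for j in (i - 1, (i + 1) % n):
            other = ring[j][1]
            if other != host:
                neighbors[host].add(other)
    return sum(len(s) for s in neighbors.values()) / num_hosts


single_token = average_distinct_neighbors(15, 1)    # 2.0: one host on each side
many_vnodes = average_distinct_neighbors(15, 256)   # close to 14, i.e. every other host
print(single_token, many_vnodes)
```

Under these assumptions, a 15-host ring with one token per host gives each host two neighbors, while many vnodes per host make practically every other host a neighbor, which is why a bootstrapping node ends up streaming from all of them.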

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Nicolas Guyomar
Yes, you are right: it limits how much data a node will send in total to other nodes while streaming (repair, bootstrap, etc.), so that it does not affect this node's performance. Bootstrapping is initiated by the bootstrapping node itself, which determines, based on its token, which nodes to ask data from,
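
Because the limit applies to each sending node, the joining node can in principle receive up to the sum of all senders' limits. A rough back-of-the-envelope sketch in Python follows; the function names are hypothetical, and the values 14 senders, 200 Mbps, and 350 GB are illustrative assumptions, not measurements from this thread.

```python
def aggregate_inbound_mbps(num_senders, per_node_limit_mbps):
    """Upper bound on what a joining node may receive if every sender hits its cap."""
    return num_senders * per_node_limit_mbps


def transfer_seconds(gigabytes, rate_mbps):
    """Time to move `gigabytes` at `rate_mbps` (1 GB = 8000 megabits, decimal units)."""
    return gigabytes * 8000 / rate_mbps


inbound = aggregate_inbound_mbps(14, 200)   # 14 existing nodes, each capped at 200 Mbps
print(inbound)                              # 2800
print(transfer_seconds(350, inbound))       # 1000.0 seconds for 350 GB at that rate
```

In practice the network, disks, and compaction on the joining node will usually cap the rate well below this bound, so treat the result as a ceiling rather than an estimate.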

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
Hi Nicolas, I have seen 'stream_throughput_outbound_megabits_per_sec', but afaik this limits what each node will provide at a maximum. What I'm more concerned about is the vast number of connections to handle and the concurrent threads, of which at least two get started for every single

Re: Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Nicolas Guyomar
Hi Jurgen, stream_throughput_outbound_megabits_per_sec is the "given total throughput in Mbps", so it does limit the "concurrent throughput" IMHO. Is that not what you are looking for? The only limits I can think of are: - number of connections between every node and the one bootstrapping - number

Is it possible / makes it sense to limit concurrent streaming during bootstrapping new nodes?

2018-02-20 Thread Jürgen Albersdorfer
Hi, I'm wondering whether it is possible, and whether it would make sense, to limit concurrent streaming when joining a new node to the cluster. I'm currently operating a 15-node C* cluster (v3.11.1) and joining another node every day. 'nodetool netstats' shows it always streams data from all other nodes.