Re: [Discuss] num_tokens default in Cassandra 4.0

2020-07-08 Thread Jeremy Hanna
Just to close the loop on this,
https://issues.apache.org/jira/browse/CASSANDRA-13701 is getting tested
now.  The project testing will be updated to use the new defaults
(both num_tokens and the new allocation algorithm, enabled by uncommenting
allocate_tokens_for_local_replication_factor: 3).  Jon wrote some
documentation on num_tokens at
https://cassandra.apache.org/doc/latest/getting_started/production.html#tokens
as part of a separate ticket he mentioned -
https://issues.apache.org/jira/browse/CASSANDRA-15600.  The new default in
Cassandra 4.0+ will be to use the new allocation algorithm with num_tokens:
16.  There is a note in the NEWS.txt about upgrading and bootstrapping.  It
is a lot of effort to change this once it is set, so hopefully new users
will be in a much better place out of the box.  Thanks everyone for your
efforts in this.

On Wed, Apr 1, 2020 at 4:28 PM Jeremy Hanna 
wrote:

> As discussed, let's go with 16.  Speaking with Anthony privately as well,
> I had forgotten that some of the analysis that Branimir had initially done
> on the skew and allocation may have been internal to DataStax so I should
> have mentioned that previously.  Thanks to Mick, Alex, and Anthony for
> doing this analysis and helping back the decision with data.  This will
> benefit the many who start with Cassandra not knowing that 256 is a bad
> number and end up with a hard-to-change decision later.  I assigned myself
> to https://issues.apache.org/jira/browse/CASSANDRA-13701.  Thanks all.
>
> On Wed, Mar 11, 2020 at 6:02 AM Mick Semb Wever  wrote:
>
>>
>>
>> > I propose we drop it to 16 immediately.  I'll add the production docs
>> > in CASSANDRA-15618 with notes on token count, the reasons why you'd
>> want 1,
>> > 4, or 16.  As a follow up, if we can get a token simulation written we
>> can
>> > try all sorts of topologies with whatever token algorithms we want.
>> Once
>> > that simulation is written and we've got some reports we can revisit.
>>
>>
>> This works for me, for our first step forward.
>> Good docs will always empower users more than any default setting can!
>>
>> cheers,
>> Mick
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-03-31 Thread Jeremy Hanna
As discussed, let's go with 16.  Speaking with Anthony privately as well, I
had forgotten that some of the analysis that Branimir had initially done on
the skew and allocation may have been internal to DataStax so I should have
mentioned that previously.  Thanks to Mick, Alex, and Anthony for doing
this analysis and helping back the decision with data.  This will benefit
the many who start with Cassandra not knowing that 256 is a bad number and
end up with a hard-to-change decision later.  I assigned myself to
https://issues.apache.org/jira/browse/CASSANDRA-13701.  Thanks all.

On Wed, Mar 11, 2020 at 6:02 AM Mick Semb Wever  wrote:

>
>
> > I propose we drop it to 16 immediately.  I'll add the production docs
> > in CASSANDRA-15618 with notes on token count, the reasons why you'd want
> 1,
> > 4, or 16.  As a follow up, if we can get a token simulation written we
> can
> > try all sorts of topologies with whatever token algorithms we want.  Once
> > that simulation is written and we've got some reports we can revisit.
>
>
> This works for me, for our first step forward.
> Good docs will always empower users more than any default setting can!
>
> cheers,
> Mick
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-03-10 Thread Mick Semb Wever


 
> I propose we drop it to 16 immediately.  I'll add the production docs
> in CASSANDRA-15618 with notes on token count, the reasons why you'd want 1,
> 4, or 16.  As a follow up, if we can get a token simulation written we can
> try all sorts of topologies with whatever token algorithms we want.  Once
> that simulation is written and we've got some reports we can revisit.


This works for me, for our first step forward.
Good docs will always empower users more than any default setting can!

cheers,
Mick

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-03-09 Thread Jon Haddad
There's a lot going on here... hopefully I can respond to everything in a
coherent manner.

> Perhaps a simple way to avoid this is to update the random allocation
algorithm to re-generate tokens when the ranges created do not have a good
size distribution?

Instead of using random tokens for the first node, I think we'd be better
off picking a random initial token and then spacing the rest evenly around
the ring, using the first token as an offset.  The main benefit of random
is that we don't get collisions, not the distribution.  I haven't read
through the change in CASSANDRA-15600; maybe it addresses this problem
already, in which case we can ignore my suggestion here.
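
A minimal sketch of the offset idea, assuming the Murmur3 partitioner's
signed 64-bit token space (illustrative only - not what CASSANDRA-15600
actually implements):

```python
import random

RING_MIN = -(2**63)   # Murmur3Partitioner's minimum token
RING_SIZE = 2**64     # total size of the token space

def initial_tokens(num_tokens: int) -> list[int]:
    """Pick one random offset, then space num_tokens tokens evenly
    around the ring, so ranges are equal-sized with a random rotation."""
    offset = random.randrange(RING_SIZE)
    step = RING_SIZE // num_tokens
    return sorted(RING_MIN + (offset + i * step) % RING_SIZE
                  for i in range(num_tokens))

print(initial_tokens(16))
```

Collisions between a node's own tokens are impossible by construction;
collisions with tokens already taken by other nodes would still need the
usual conflict check.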

> Clusters where we have used num_tokens 4 we have regretted.
> While we accept the validity and importance of the increased availability
provided by num_tokens 4, we have never seen or used it in practice.

While we worked together, I personally moved quite a few clusters to 4
tokens, and didn't run into any balance issues.  I'm not sure why you're
saying you've never seen it in practice; I did it with a whole bunch of our
clients.

Mick said:

> We know of a number of production clusters that have been set up this
way. I am unaware of any Cassandra docs or community recommendations that
say you should avoid doing this. So, this is a problem regardless of the
value for num_tokens.

Paulo:

> Having the number of racks not a multiple of the replication factor is
not a good practice since it can lead to imbalance and other problems like
this, so we should not only document this but perhaps add a warning or even
hard fail when this is encountered during node startup?

Agreed on both the above - I intend to document this in CASSANDRA-15618.

Mick, from your test:

>  Each cluster was configured with one rack.

This is an important nuance of the results you're seeing.  It sounds like
the test covers the edge case of using a single rack / AZ for an entire
cluster.  I can't remember too many times where I actually saw this, of the
several hundred clusters I looked at over the almost 4 years I was at TLP.
   This isn't to say it's not out there in the wild, but I don't think it
should drive us to pick a token count.  We can probably do better than
using a completely random algorithm for the corner case of using a single
rack or fewer racks than RF, and we should also encourage people to run
Cassandra in a way that doesn't set them up for a gunshot to the foot.

In a world of tradeoffs, I'm still not convinced that 16 tokens makes any
sense as a default.  Assuming we can fix the worst case random imbalance in
small clusters, 4 is a significantly better option as it will make it
easier for teams to scale Cassandra out the way we claim they can.  Using
16 tokens brings an unnecessary (and probably unknown) ceiling to people's
abilities to scale and for the *majority* of clusters where people pick
Cassandra for scalability and availability it's still too high.  I'd rather
we ship a default that works best for the majority of people and document
the cases where people might want to deviate from it, rather than picking a
somewhat crappy (but better than 256) default.

That said, we don't have the better token distribution yet, so if we're
going to assume people just put C* in production with minimal configuration
changes, 16 will help us deal with the imbalance issues *today*.  We know
it works better than 256, so I'm willing to take this as a win *today*, on
the assumption that folks are OK changing this value again before we
release 4.0 if we find we can make it work without the super sharp edges
that we can currently stab ourselves with.  I'd much rather ship C* with 16
tokens than 256, and I don't want to keep debating this so much we don't
end up making any change at all.

I propose we drop it to 16 immediately.  I'll add the production docs
in CASSANDRA-15618 with notes on token count, the reasons why you'd want 1,
4, or 16.  As a follow up, if we can get a token simulation written we can
try all sorts of topologies with whatever token algorithms we want.  Once
that simulation is written and we've got some reports we can revisit.

Eventually we'll probably need to add the ability for folks to fix cluster
imbalances without adding / removing hardware, but I suspect we've got a
fair amount of plumbing to rework to make something like that doable.

Jon


On Mon, Mar 9, 2020 at 5:03 AM Paulo Motta  wrote:

> Great investigation, good job guys!
>
> > Personally I would have liked to have seen even more iterations. While
> 14 run iterations gives an indication, the average of randomness is not
> what is important here. What concerns me is the consequence to imbalances
> as the cluster grows when you're very unlucky with initial random tokens,
> for example when random tokens land very close together. The token
> allocation can deal with breaking up large token ranges but is unable to do
> anything about such tiny token ranges. Even a bad 1-in-a-100 experience
> should be a consideration when picking a default num_tokens.

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-03-09 Thread Paulo Motta
Great investigation, good job guys!

> Personally I would have liked to have seen even more iterations. While 14
run iterations gives an indication, the average of randomness is not what
is important here. What concerns me is the consequence to imbalances as the
cluster grows when you're very unlucky with initial random tokens, for
example when random tokens land very close together. The token allocation
can deal with breaking up large token ranges but is unable to do anything
about such tiny token ranges. Even a bad 1-in-a-100 experience should be a
consideration when picking a default num_tokens.

Perhaps a simple way to avoid this is to update the random allocation
algorithm to re-generate tokens when the ranges created do not have a good
size distribution?
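
One way to read that suggestion, as a hypothetical sketch (the 3x skew
threshold and the retry cap are invented here for illustration, not taken
from any patch):

```python
import random

RING_MIN = -(2**63)
RING_SIZE = 2**64

def random_tokens_with_retry(num_tokens: int, max_skew: float = 3.0,
                             attempts: int = 100) -> list[int]:
    """Re-roll random tokens until no resulting range is more than
    max_skew times the ideal (perfectly even) range size."""
    for _ in range(attempts):
        tokens = sorted({random.randrange(RING_MIN, RING_MIN + RING_SIZE)
                         for _ in range(num_tokens)})
        if len(tokens) < num_tokens:
            continue  # duplicate token (astronomically unlikely); re-roll
        # Range i ends at tokens[i]; the i == 0 range wraps around the ring.
        ranges = [(tokens[i] - tokens[i - 1]) % RING_SIZE
                  for i in range(num_tokens)]
        if max(ranges) <= max_skew * (RING_SIZE / num_tokens):
            return tokens
    return tokens  # give up and keep the last candidate
```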

> But it can be worse, for example if you have RF=3 and only two racks then
you will only get random tokens. We know of a number of production clusters
that have been set up this way. I am unaware of any Cassandra docs or
community recommendations that say you should avoid doing this. So, this is
a problem regardless of the value for num_tokens.

Having the number of racks not a multiple of the replication factor is not
a good practice since it can lead to imbalance and other problems like
this, so we should not only document this but perhaps add a warning or even
hard fail when this is encountered during node startup?

Cheers,

Paulo

On Mon, Mar 9, 2020 at 08:25, Mick Semb Wever
wrote:

>
> > Can we ask for some analysis and data against the risks different
> > num_tokens choices present. We shouldn't rush into a new default, and
> > such background information and data adds value for operators.
>
>
> Thanks for everyone's patience on this topic.
> The following is further input on a number of fronts.
>
>
> ** Analysis of Token Distributions
>
> The following is work done by Alex Dejanovski and Anthony Grasso. It
> builds upon their previous work at The Last Pickle, and explains why we
> recommend 16 as the best value to clients. (Please buy beers for these two
> for the effort they have put in here.)
>
> The following three graphs show the ranges of imbalance that occur on
> clusters growing from 4 nodes to 12 nodes, for the different values of
> num_tokens: 4, 8 and 16. The range is based on 14 run iterations (except 16
> which only got ten).
>
> [Graphs for num_tokens: 4, 8, and 16 were attached here; the images are
> not reproduced in this archive.]
>
> These graphs were generated using clusters created in AWS by tlp-cluster (
> https://github.com/thelastpickle/tlp-cluster). A script was written to
> automate the testing and generate the data for each value of num_tokens.
> Each cluster was configured with one rack.  Of course these interpretations
> are debatable. The data behind the graphs is in
> https://docs.google.com/spreadsheets/d/1gPZpSOUm3_pSCo9y-ZJ8WIctpvXNr5hDdupJ7K_9PHY/edit?usp=sharing
>
>
> What I see from these graphs is…
>  a)  token allocation is pretty good at fixing initial bad random token
> imbalances. By the time you are at 12 nodes, presuming you have set up the
> cluster correctly so that token allocation actually works, your nodes will
> be balanced with num_tokens 4 or greater.
>  b) you need to get to ~12 nodes with num_tokens 4 to have a good balance.
>  c) you need to get to ~9 nodes with num_tokens 8 to have a good balance.
>  d) you need to get to ~6 nodes with num_tokens 16 to have a good balance.
>
> Personally I would have liked to have seen even more iterations. While 14
> run iterations gives an indication, the average of randomness is not what
> is important here. What concerns me is the consequence to imbalances as the
> cluster grows when you're very unlucky with initial random tokens, for
> example when random tokens land very close together. The token allocation
> can deal with breaking up large token ranges but is unable to do anything
> about such tiny token ranges. Even a bad 1-in-a-100 experience should be a
> consideration when picking a default num_tokens.
>
>
> ** When does the Token Allocation work…
>
> This has been touched on already in this thread. There are cases where
> token allocation fails to kick in. The first node in up to RF racks
> generates random tokens; with RF=3 this typically means the first three nodes.
>
> But it can be worse, for example if you have RF=3 and only two racks then
> you will only get random tokens. We know of a number of production clusters
> that have been set up this way. I am unaware of any Cassandra docs or
> community recommendations that say you should avoid doing this. So, this is
> a problem regardless of the value for num_tokens.
>
>
> ** Algorithmic token allocation does not handle the racks = RF case well (
> CASSANDRA-15600 )
>
> This recently landed in trunk. My understanding is that this improves the
> situation the graphs cover, but not the situation just described where a DC
> has more than one rack but fewer racks than RF (1 < racks < RF).
> Ekaterina, maybe you could elaborate?
>
>
> ** Decommissioning 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-21 Thread Mick Semb Wever
The appeal to 'perfect is the enemy...' is appreciated. But I (we) have
seen from experience that this is about what is good rather than what is
perfect.

I'm not suggesting we create a foolproof system, just one that is safe
against what we know happens all too often in production systems.

I believe there is some further analysis and testing happening, so I'm only
asking that we exercise a bit of patience so that our definition of what is
"good" (vs perfect) is grounded.


On Wed., 19 Feb. 2020, 1:35 pm Jeremiah Jordan, 
wrote:

> If you don’t know what you are doing you will have one rack which will
> also be safe. If you are setting up racks then you most likely read
> something about doing that, and should also be fine.
> This discussion has gone off the rails 100 times with what ifs that are
> “letting perfect be the enemy of good”. The setting doesn’t need to be
> perfect. It just needs to be “good enough”.
>
> > On Feb 19, 2020, at 1:44 AM, Mick Semb Wever 
> wrote:
> >
> > Why do we have to assume random assignment?
> >
> >
> >
> > Because token allocation only works once you have a node in RF racks. If
> > you don't bootstrap nodes in alternating racks, or just never have RF
> racks
> > setup (but more than one rack) it's going to be random.
> >
> > Whatever default we choose should be a safe choice, not the best for
> > experts. Making it safe (4 as the default would be great) shouldn't be
> > difficult, and I thought Joey was building a  list of related issues?
> >
> > Seeing these issues put together summarised would really help build the
> > consensus IMHO.
> >
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-19 Thread Jon Haddad
Joey Lynch had a good idea - that if the allocate tokens for RF isn't set
we use 1 as the RF.  I suggested we take it a step further and use the rack
count as the RF if it's not set.

This should take care of most clusters even if they don't set the RF, and
will handle the uneven distribution when provisioning a new cluster.
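
A sketch of that fallback chain (hypothetical helper; the real change
would live in Cassandra's token allocation code, and these names are made
up):

```python
from typing import Optional

def effective_allocation_rf(configured_rf: Optional[int],
                            rack_count: int) -> int:
    """Use the configured allocate-tokens RF if set; otherwise fall
    back to the rack count; otherwise default to 1."""
    if configured_rf is not None:
        return configured_rf
    return rack_count if rack_count > 0 else 1
```

With three racks and nothing configured, this would allocate as if RF=3,
which matches how most people deploy across three AZs.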

The only case where you'd want more tokens is to scale down, which I saw in
very few clusters of the hundreds I've worked on.



On Wed, Feb 19, 2020 at 4:35 AM Jeremiah Jordan 
wrote:

> If you don’t know what you are doing you will have one rack which will
> also be safe. If you are setting up racks then you most likely read
> something about doing that, and should also be fine.
> This discussion has gone off the rails 100 times with what ifs that are
> “letting perfect be the enemy of good”. The setting doesn’t need to be
> perfect. It just needs to be “good enough”.
>
> > On Feb 19, 2020, at 1:44 AM, Mick Semb Wever 
> wrote:
> >
> > Why do we have to assume random assignment?
> >
> >
> >
> > Because token allocation only works once you have a node in RF racks. If
> > you don't bootstrap nodes in alternating racks, or just never have RF
> racks
> > setup (but more than one rack) it's going to be random.
> >
> > Whatever default we choose should be a safe choice, not the best for
> > experts. Making it safe (4 as the default would be great) shouldn't be
> > difficult, and I thought Joey was building a  list of related issues?
> >
> > Seeing these issues put together summarised would really help build the
> > consensus IMHO.
> >
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-19 Thread Jeremiah Jordan
If you don’t know what you are doing you will have one rack which will also be 
safe. If you are setting up racks then you most likely read something about 
doing that, and should also be fine.
This discussion has gone off the rails 100 times with what ifs that are 
“letting perfect be the enemy of good”. The setting doesn’t need to be perfect. 
It just needs to be “good enough”.

> On Feb 19, 2020, at 1:44 AM, Mick Semb Wever  wrote:
> 
> Why do we have to assume random assignment?
> 
> 
> 
> Because token allocation only works once you have a node in RF racks. If
> you don't bootstrap nodes in alternating racks, or just never have RF racks
> setup (but more than one rack) it's going to be random.
> 
> Whatever default we choose should be a safe choice, not the best for
> experts. Making it safe (4 as the default would be great) shouldn't be
> difficult, and I thought Joey was building a  list of related issues?
> 
> Seeing these issues put together summarised would really help build the
> consensus IMHO.
> 
>> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-18 Thread Jeremiah D Jordan
+1 for 8 + algorithm assignment being the default.

Why do we have to assume random assignment?  If someone turns off algorithm 
assignment they are changing away from defaults, so they should also adjust
num_tokens.

-Jeremiah

> On Feb 18, 2020, at 1:44 AM, Mick Semb Wever  wrote:
> 
> -1
> 
> Discussions here and on slack have brought up a number of important
> concerns. I think those concerns need to be summarised here before any
> informal vote.
> 
> It was my understanding that some of those concerns may even be blockers to
> a move to 16. That is, we have to presume the worst-case scenario where all
> tokens get randomly generated.
> 
> Can we ask for some analysis and data against the risks different
> num_tokens choices present. We shouldn't rush into a new default, and such
> background information and data adds value for operators. Maybe I missed
> info/experiments that have happened?
> 
> 
> 
> On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna, 
> wrote:
> 
>> I just wanted to close the loop on this if possible.  After some discussion
>> in slack about various topics, I would like to see if people are okay with
>> num_tokens=8 by default (as it's not much different operationally than
>> 16).  Joey brought up a few small changes that I can put on the ticket.  It
>> also requires some documentation for things like decommission order and
>> skew.
>> 
>> Are people okay with this change moving forward like this?  If so, I'll
>> comment on the ticket and we can move forward.
>> 
>> Thanks,
>> 
>> Jeremy
>> 
>> On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad  wrote:
>> 
>>> I think it's a good idea to take a step back and get a high level view of
>>> the problem we're trying to solve.
>>> 
>>> First, high token counts result in decreased availability as each node
>> has
>>> data overlap with more nodes in the cluster.  Specifically, a node
>> can
>>> share data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token cluster at RF=3 is
>>> going to almost always share data with every other node in the cluster
>> that
>>> isn't in the same rack, unless you're doing something wild like using
>> more
>>> than a thousand nodes in a cluster.  We advertise
>>> 
>>> With 16 tokens, that is vastly improved, but you still have up to 64
>> nodes
>>> each node needs to query against, so you're again, hitting every node
>>> unless you go above ~96 nodes in the cluster (assuming 3 racks / AZs).  I
>>> wouldn't use 16 here, and I doubt any of you would either.  I've
>> advocated
>>> for 4 tokens because you'd have overlap with only 16 nodes, which works
>>> well for small clusters as well as large.  Assuming I was creating a new
>>> cluster for myself (in a hypothetical brand new application I'm
>> building) I
>>> would put this in production.  I have worked with several teams where I
>>> helped them put 4 token clusters in prod and it has worked very well.  We
>>> didn't see any wild imbalance issues.
>>> 
>>> As Mick's pointed out, our current method of using random token
>> assignment
>>> for the default number of tokens is problematic for 4 tokens.  I fully agree with
>>> this, and I think if we were to try to use 4 tokens, we'd want to address
>>> this in tandem.  We can discuss how to better allocate tokens by default
>>> (something more predictable than random), but I'd like to avoid the
>>> specifics of that for the sake of this email.
>>> 
>>> To Alex's point, repairs are problematic with lower token counts due to
>>> over streaming.  I think this is a pretty serious issue and we'd have
>> to
>>> address it before going all the way down to 4.  This, in my opinion, is a
>>> more complex problem to solve and I think trying to fix it here could
>> make
>>> shipping 4.0 take even longer, something none of us want.
>>> 
>>> For the sake of shipping 4.0 without adding extra overhead and time, I'm
>> ok
>>> with moving to 16 tokens, and in the process adding extensive
>> documentation
>>> outlining what we recommend for production use.  I think we should also
>> try
>>> to figure out something better than random as the default to fix the data
>>> imbalance issues.  I've got a few ideas here I've been noodling on.
>>> 
>>> As long as folks are fine with potentially changing the default again in
>> C*
>>> 5.0 (after another discussion / debate), 16 is enough of an improvement
>>> that I'm OK with the change, and willing to author the docs to help
>> people
>>> set up their first cluster.  For folks that go into production with the
>>> defaults, we're at least not setting them up for total failure once their
>>> clusters get large like we are now.
>>> 
>>> In future versions, we'll probably want to address the issue of data
>>> imbalance by building something in that shifts individual tokens
>> around.  I
>>> don't think we should try to do this in 4.0 either.
>>> 
>>> Jon
>>> 
>>> 
>>> 
>>> On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna
>>> wrote:
>>> 
 I think Mick and Anthony make some valid operational and skew points
>> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-18 Thread Joshua McKenzie
>
> Discussions here and on slack have brought up a number of important
> concerns.

Sounds like we're letting the perfect be the enemy of the good. Is anyone
arguing that 256 is a better default than 16? Or is the fear that going to
16 now would make a default change in, say, 5.0 more painful?


On Tue, Feb 18, 2020 at 3:12 AM Ben Slater 
wrote:

> In case it helps move the decision along, we moved to 16 vnodes as default
> in Nov 2018 and haven't looked back (many clusters from 3-100s of nodes
> later). The testing we did in making that decision is summarised here:
> https://www.instaclustr.com/cassandra-vnodes-how-many-should-i-use/
>
> Cheers
> Ben
>
> *Ben Slater*, Chief Product Officer, Instaclustr
>
>
> On Tue, 18 Feb 2020 at 18:44, Mick Semb Wever 
> wrote:
>
> > -1
> >
> > Discussions here and on slack have brought up a number of important
> > concerns. I think those concerns need to be summarised here before any
> > informal vote.
> >
> > It was my understanding that some of those concerns may even be blockers
> > to a move to 16. That is, we have to presume the worst-case scenario
> > where all tokens get randomly generated.
> >
> > Can we ask for some analysis and data against the risks different
> > num_tokens choices present. We shouldn't rush into a new default, and
> > such background information and data adds value for operators. Maybe I
> > missed info/experiments that have happened?
> >
> >
> >
> > On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna, <
> jeremy.hanna1...@gmail.com>
> > wrote:
> >
> > > I just wanted to close the loop on this if possible.  After some
> > discussion
> > > in slack about various topics, I would like to see if people are okay
> > with
> > > num_tokens=8 by default (as it's not much different operationally than
> > > 16).  Joey brought up a few small changes that I can put on the ticket.
> > It
> > > also requires some documentation for things like decommission order and
> > > skew.
> > >
> > > Are people okay with this change moving forward like this?  If so, I'll
> > > comment on the ticket and we can move forward.
> > >
> > > Thanks,
> > >
> > > Jeremy
> > >
> > > On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad  wrote:
> > >
> > > > I think it's a good idea to take a step back and get a high level
> view
> > of
> > > > the problem we're trying to solve.
> > > >
> > > > First, high token counts result in decreased availability as each
> node
> > > has
> > > > data overlap with more nodes in the cluster.  Specifically, a
> node
> > > can
> > > > share data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token cluster at
> RF=3
> > is
> > > > going to almost always share data with every other node in the
> cluster
> > > that
> > > > isn't in the same rack, unless you're doing something wild like using
> > > more
> > > > than a thousand nodes in a cluster.  We advertise
> > > >
> > > > With 16 tokens, that is vastly improved, but you still have up to 64
> > > nodes
> > > > each node needs to query against, so you're again, hitting every node
> > > > unless you go above ~96 nodes in the cluster (assuming 3 racks /
> > AZs).  I
> > > > wouldn't use 16 here, and I doubt any of you would either.  I've
> > > advocated
> > > > for 4 tokens because you'd have overlap with only 16 nodes, which
> works
> > > > well for small clusters as well as large.  Assuming I was creating a
> > new
> > > > cluster for myself (in a hypothetical brand new application I'm
> > > building) I
> > > > would put this in production.  I have worked with several teams
> where I
> > > > helped them put 4 token clusters in prod and it has worked very well.
> > We
> > > > didn't see any wild imbalance issues.
> > > >
> > > > As Mick's pointed out, our current method of using random token
> > > assignment
> > > > for the default number of tokens is problematic for 4 tokens.  I fully agree
> with
> > > > this, and I think if we were to try to use 4 tokens, we'd want to
> > address
> > > > this in tandem.  We can discuss how to better allocate tokens by
> > default
> > > > (something more predictable than random), but I'd like to avoid the
> > > > specifics of that for the sake of this email.
> > > >
> > > > To Alex's point, repairs are problematic with lower token counts due
> to
> > > > over 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-18 Thread Ben Slater
In case it helps move the decision along, we moved to 16 vnodes as default
in Nov 2018 and haven't looked back (many clusters from 3-100s of nodes
later). The testing we did in making that decision is summarised here:
https://www.instaclustr.com/cassandra-vnodes-how-many-should-i-use/

Cheers
Ben

*Ben Slater*, Chief Product Officer, Instaclustr


On Tue, 18 Feb 2020 at 18:44, Mick Semb Wever 
wrote:

> -1
>
> Discussions here and on slack have brought up a number of important
> concerns. I think those concerns need to be summarised here before any
> informal vote.
>
> It was my understanding that some of those concerns may even be blockers to
> a move to 16. That is, we have to presume the worst-case scenario where all
> tokens get randomly generated.
>
> Can we ask for some analysis and data against the risks different
> num_tokens choices present. We shouldn't rush into a new default, and such
> background information and data adds value for operators. Maybe I missed
> info/experiments that have happened?
>
>
>
> On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna, 
> wrote:
>
> > I just wanted to close the loop on this if possible.  After some
> discussion
> > in slack about various topics, I would like to see if people are okay
> with
> > num_tokens=8 by default (as it's not much different operationally than
> > 16).  Joey brought up a few small changes that I can put on the ticket.
> It
> > also requires some documentation for things like decommission order and
> > skew.
> >
> > Are people okay with this change moving forward like this?  If so, I'll
> > comment on the ticket and we can move forward.
> >
> > Thanks,
> >
> > Jeremy
> >
> > On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad  wrote:
> >
> > > I think it's a good idea to take a step back and get a high level view
> of
> > > the problem we're trying to solve.
> > >
> > > First, high token counts result in decreased availability as each node
> > has
> > > data overlap with more nodes in the cluster.  Specifically, a node
> > can
> > > share data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token cluster at RF=3
> is
> > > going to almost always share data with every other node in the cluster
> > that
> > > isn't in the same rack, unless you're doing something wild like using
> > more
> > > than a thousand nodes in a cluster.  We advertise
> > >
> > > With 16 tokens, that is vastly improved, but you still have up to 64
> > nodes
> > > each node needs to query against, so you're again, hitting every node
> > > unless you go above ~96 nodes in the cluster (assuming 3 racks /
> AZs).  I
> > > wouldn't use 16 here, and I doubt any of you would either.  I've
> > advocated
> > > for 4 tokens because you'd have overlap with only 16 nodes, which works
> > > well for small clusters as well as large.  Assuming I was creating a
> new
> > > cluster for myself (in a hypothetical brand new application I'm
> > building) I
> > > would put this in production.  I have worked with several teams where I
> > > helped them put 4 token clusters in prod and it has worked very well.
> We
> > > didn't see any wild imbalance issues.
> > >
> > > As Mick's pointed out, our current method of using random token
> > assignment
> > > for the default number of tokens is problematic for 4 tokens.  I fully agree with
> > > this, and I think if we were to try to use 4 tokens, we'd want to
> address
> > > this in tandem.  We can discuss how to better allocate tokens by
> default
> > > (something more predictable than random), but I'd like to avoid the
> > > specifics of that for the sake of this email.
> > >
> > > To Alex's point, repairs are problematic with lower token counts due to
> > > over streaming.  I think this is a pretty serious issue and we'd have
> > to
> > > address it before going all the way down to 4.  This, in my opinion,
> is a
> > > more complex problem to solve and I think trying to fix it here could
> > make
> > > shipping 4.0 take even longer, something none of us want.
> > >
> > > For the sake of shipping 4.0 without adding extra overhead and time,
> I'm
> > ok
> > > with moving to 16 tokens, and in the process adding extensive
> > documentation
> > > outlining what we recommend for production use.  I think we should also
> > try
> > > to figure out something better than random as the 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-17 Thread Mick Semb Wever
-1

Discussions here and on slack have brought up a number of important
concerns. I think those concerns need to be summarised here before any
informal vote.

It was my understanding that some of those concerns may even be blockers to
a move to 16. That is, we have to presume the worst-case scenario where all
tokens get randomly generated.

Can we ask for some analysis and data against the risks different
num_tokens choices present. We shouldn't rush into a new default, and such
background information and data adds value for operators. Maybe I missed
info/experiments that have happened?



On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna, 
wrote:

> I just wanted to close the loop on this if possible.  After some discussion
> in slack about various topics, I would like to see if people are okay with
> num_tokens=8 by default (as it's not much different operationally than
> 16).  Joey brought up a few small changes that I can put on the ticket.  It
> also requires some documentation for things like decommission order and
> skew.
>
> Are people okay with this change moving forward like this?  If so, I'll
> comment on the ticket and we can move forward.
>
> Thanks,
>
> Jeremy
>
> On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad  wrote:
>
> > I think it's a good idea to take a step back and get a high level view of
> > the problem we're trying to solve.
> >
> > First, high token counts result in decreased availability as each node
> has
> > data overlap with more nodes in the cluster.  Specifically, a node
> can
> > share data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token cluster at RF=3 is
> > going to almost always share data with every other node in the cluster
> that
> > isn't in the same rack, unless you're doing something wild like using
> more
> > than a thousand nodes in a cluster.  We advertise
> >
> > With 16 tokens, that is vastly improved, but you still have up to 64
> nodes
> > each node needs to query against, so you're again, hitting every node
> > unless you go above ~96 nodes in the cluster (assuming 3 racks / AZs).  I
> > wouldn't use 16 here, and I doubt any of you would either.  I've
> advocated
> > for 4 tokens because you'd have overlap with only 16 nodes, which works
> > well for small clusters as well as large.  Assuming I was creating a new
> > cluster for myself (in a hypothetical brand new application I'm
> building) I
> > would put this in production.  I have worked with several teams where I
> > helped them put 4 token clusters in prod and it has worked very well.  We
> > didn't see any wild imbalance issues.
> >
> > As Mick's pointed out, our current method of using random token
> assignment
> > for the default number of tokens is problematic for 4 tokens.  I fully agree with
> > this, and I think if we were to try to use 4 tokens, we'd want to address
> > this in tandem.  We can discuss how to better allocate tokens by default
> > (something more predictable than random), but I'd like to avoid the
> > specifics of that for the sake of this email.
> >
> > To Alex's point, repairs are problematic with lower token counts due to
> > over streaming.  I think this is a pretty serious issue and we'd have
> to
> > address it before going all the way down to 4.  This, in my opinion, is a
> > more complex problem to solve and I think trying to fix it here could
> make
> > shipping 4.0 take even longer, something none of us want.
> >
> > For the sake of shipping 4.0 without adding extra overhead and time, I'm
> ok
> > with moving to 16 tokens, and in the process adding extensive
> documentation
> > outlining what we recommend for production use.  I think we should also
> try
> > to figure out something better than random as the default to fix the data
> > imbalance issues.  I've got a few ideas here I've been noodling on.
> >
> > As long as folks are fine with potentially changing the default again in
> C*
> > 5.0 (after another discussion / debate), 16 is enough of an improvement
> > that I'm OK with the change, and willing to author the docs to help
> people
> > set up their first cluster.  For folks that go into production with the
> > defaults, we're at least not setting them up for total failure once their
> > clusters get large like we are now.
> >
> > In future versions, we'll probably want to address the issue of data
> > imbalance by building something in that shifts individual tokens
> around.  I
> > don't think we should try to do this in 4.0 either.
> >
> > Jon
> >
> >
> >
> > On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna
> > wrote:
> >
> > > I think Mick and Anthony make some valid operational and skew points
> for
> > > smaller/starting clusters with 4 num_tokens. There’s an arbitrary line
> > > between small and large clusters but I think most would agree that most
> > > clusters are on the small to medium side. (A small nuance is afaict the
> > > probabilities have to do with quorum on a full token range, ie it has
> to
> > do
> > > with the size of a datacenter, not the full cluster.)
> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-17 Thread Rahul Singh
+1 on 8

rahul.xavier.si...@gmail.com

http://cassandra.link
The Apache Cassandra Knowledge Base.
On Feb 17, 2020, 5:20 PM -0500, Erick Ramirez , 
wrote:
> +1 on 8 tokens. I'd personally like us to be able to move this along pretty
> quickly as it's confusing for users looking for direction. Cheers!
>
> On Tue, 18 Feb 2020, 9:14 am Jeremy Hanna, 
> wrote:
>
> > I just wanted to close the loop on this if possible. After some discussion
> > in slack about various topics, I would like to see if people are okay with
> > num_tokens=8 by default (as it's not much different operationally than
> > 16). Joey brought up a few small changes that I can put on the ticket. It
> > also requires some documentation for things like decommission order and
> > skew.
> >
> > Are people okay with this change moving forward like this? If so, I'll
> > comment on the ticket and we can move forward.
> >
> > Thanks,
> >
> > Jeremy
> >


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-17 Thread Erick Ramirez
+1 on 8 tokens. I'd personally like us to be able to move this along pretty
quickly as it's confusing for users looking for direction. Cheers!

On Tue, 18 Feb 2020, 9:14 am Jeremy Hanna, 
wrote:

> I just wanted to close the loop on this if possible.  After some discussion
> in slack about various topics, I would like to see if people are okay with
> num_tokens=8 by default (as it's not much different operationally than
> 16).  Joey brought up a few small changes that I can put on the ticket.  It
> also requires some documentation for things like decommission order and
> skew.
>
> Are people okay with this change moving forward like this?  If so, I'll
> comment on the ticket and we can move forward.
>
> Thanks,
>
> Jeremy
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-03 Thread Jon Haddad
I think it's a good idea to take a step back and get a high level view of
the problem we're trying to solve.

First, high token counts result in decreased availability as each node has
data overlap with more nodes in the cluster.  Specifically, a node can
share data with up to (RF-1) * 2 * num_tokens other nodes.  So a 256 token
cluster at RF=3 is
going to almost always share data with every other node in the cluster that
isn't in the same rack, unless you're doing something wild like using more
than a thousand nodes in a cluster.  We advertise

With 16 tokens, that is vastly improved, but you still have up to 64 nodes
each node needs to query against, so you're again, hitting every node
unless you go above ~96 nodes in the cluster (assuming 3 racks / AZs).  I
wouldn't use 16 here, and I doubt any of you would either.  I've advocated
for 4 tokens because you'd have overlap with only 16 nodes, which works
well for small clusters as well as large.  Assuming I was creating a new
cluster for myself (in a hypothetical brand new application I'm building) I
would put this in production.  I have worked with several teams where I
helped them put 4 token clusters in prod and it has worked very well.  We
didn't see any wild imbalance issues.
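
To make that arithmetic concrete, a small sketch of the overlap bound
(the helper is made up; the bound is the (RF-1) * 2 * num_tokens figure
quoted above, capped by the number of other nodes):

```python
def max_overlap_nodes(rf: int, num_tokens: int, cluster_size: int) -> int:
    """Upper bound on how many other nodes a single node shares data
    with: (RF-1) * 2 * num_tokens, capped at everyone else."""
    return min((rf - 1) * 2 * num_tokens, cluster_size - 1)

for tokens in (4, 16, 256):
    print(tokens, max_overlap_nodes(rf=3, num_tokens=tokens, cluster_size=100))
# 4 -> 16, 16 -> 64, 256 -> 99 (every other node in a 100-node cluster)
```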

As Mick's pointed out, our current method of using random token assignment
for the default number of tokens is problematic for 4 tokens.  I fully agree with
this, and I think if we were to try to use 4 tokens, we'd want to address
this in tandem.  We can discuss how to better allocate tokens by default
(something more predictable than random), but I'd like to avoid the
specifics of that for the sake of this email.

To Alex's point, repairs are problematic with lower token counts due to
over streaming.  I think this is a pretty serious issue and we'd have to
address it before going all the way down to 4.  This, in my opinion, is a
more complex problem to solve and I think trying to fix it here could make
shipping 4.0 take even longer, something none of us want.

For the sake of shipping 4.0 without adding extra overhead and time, I'm ok
with moving to 16 tokens, and in the process adding extensive documentation
outlining what we recommend for production use.  I think we should also try
to figure out something better than random as the default to fix the data
imbalance issues.  I've got a few ideas here I've been noodling on.

As long as folks are fine with potentially changing the default again in C*
5.0 (after another discussion / debate), 16 is enough of an improvement
that I'm OK with the change, and willing to author the docs to help people
set up their first cluster.  For folks that go into production with the
defaults, we're at least not setting them up for total failure once their
clusters get large like we are now.

In future versions, we'll probably want to address the issue of data
imbalance by building something in that shifts individual tokens around.  I
don't think we should try to do this in 4.0 either.

Jon



On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna 
wrote:

> I think Mick and Anthony make some valid operational and skew points for
> smaller/starting clusters with 4 num_tokens. There’s an arbitrary line
> between small and large clusters but I think most would agree that most
> clusters are on the small to medium side. (A small nuance is afaict the
> probabilities have to do with quorum on a full token range, ie it has to do
> with the size of a datacenter, not the full cluster.)
>
> As I read this discussion I’m personally more inclined to go with 16 for
> now. It’s true that if we could fix the skew and topology gotchas for those
> starting things up, 4 would be ideal from an availability perspective.
> However we’re still in the brainstorming stage for how to address those
> challenges. I think we should create tickets for those issues and go with
> 16 for 4.0.
>
> This is about an out of the box experience. It balances availability,
> operations (such as skew and general bootstrap friendliness and
> streaming/repair), and cluster sizing. Balancing all of those, I think for
> now I’m more comfortable with 16 as the default with docs on considerations
> and tickets to unblock 4 as the default for all users.
>
> >>> On Feb 1, 2020, at 6:30 AM, Jeff Jirsa  wrote:
> >> On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch 
> wrote:
> >> I think that we might be bikeshedding this number a bit because it is
> easy
> >> to debate and there is not yet one right answer.
> >
> >
> > https://www.youtube.com/watch?v=v465T5u9UKo
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Jeremy Hanna
I think Mick and Anthony make some valid operational and skew points for 
smaller/starting clusters with 4 num_tokens. There’s an arbitrary line between 
small and large clusters but I think most would agree that most clusters are on 
the small to medium side. (A small nuance is afaict the probabilities have to 
do with quorum on a full token range, ie it has to do with the size of a 
datacenter, not the full cluster.)

As I read this discussion I’m personally more inclined to go with 16 for now. 
It’s true that if we could fix the skew and topology gotchas for those starting 
things up, 4 would be ideal from an availability perspective. However we’re 
still in the brainstorming stage for how to address those challenges. I think 
we should create tickets for those issues and go with 16 for 4.0.

This is about an out of the box experience. It balances availability, 
operations (such as skew and general bootstrap friendliness and 
streaming/repair), and cluster sizing. Balancing all of those, I think for now 
I’m more comfortable with 16 as the default with docs on considerations and 
tickets to unblock 4 as the default for all users.

>>> On Feb 1, 2020, at 6:30 AM, Jeff Jirsa  wrote:
>> On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch  wrote:
>> I think that we might be bikeshedding this number a bit because it is easy
>> to debate and there is not yet one right answer.
> 
> 
> https://www.youtube.com/watch?v=v465T5u9UKo

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Jeff Jirsa
On Fri, Jan 31, 2020 at 11:25 AM Joseph Lynch  wrote:

> I think that we might be bikeshedding this number a bit because it is easy
> to debate and there is not yet one right answer.
>


https://www.youtube.com/watch?v=v465T5u9UKo


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Joseph Lynch
I think that we might be bikeshedding this number a bit because it is easy
to debate and there is not yet one right answer. I hope we recognize either
choice (4 or 16) is fine in that users can always override us and we can
always change our minds later or better yet improve allocation so users
don't have to care. Either choice is an improvement on the status quo. I
only truly care that when we change this default let's make sure that:
1. Users can still launch a cluster all at once. Last I checked even with
allocate_for_rf you need to bootstrap one node at a time for even
allocation to work properly; please someone correct me if I'm wrong, and if
I'm not let's get this fixed before the beta.
2. We get good documentation about this choice into our docs.
[documentation team and I are on it!]

I don't like phrasing this as a "small user" vs "large user" discussion.
Everybody using Cassandra wants it to be easy to operate with high
availability and consistent performance. Optimizing for "I can oops a few
nodes and not have an outage" is an important thing to optimize for
regardless of scale. It seems we have a lot of input on this thread that
we're frequently seeing users override this to 4 (apparently even with
random allocation? I am personally surprised by this if true). Some people
have indicated that they like a higher number like 16 or 32. Some (most?)
of our largest users by footprint are still using 1.

The only significant advantage I'm aware of for 16 over 4 is that users can
scale up and down in increments of N/16 (12 node cluster -> 1) instead of
N/4 (12 node cluster -> 3) without further token allocation improvements in
Cassandra. Practically speaking I think people are often spreading nodes
out over RF=3 "racks" (e.g. GCP, Azure, and AWS) so they'll want to scale
by increments of 3 anyways. I agree with Jon that optimizing for
scale-downs is odd; it's a pretty infrequent operation and all the users I
know doing autoscaling are doing it vertically using network-attached
storage (~EBS). Let's also remember repairing clusters with 16 tokens per
node is slower (probably about 2-4x slower) than repairing clusters with 4
tokens.

With zero copy streaming there should be no benefit to more tokens for data
transfer; if there is, it is a bug in streaming performance and we should
fix it.
Honestly, in my opinion if we have balancing issues with a small number of
tokens that is a bug and we should just fix it; token moves are safe, and it
is definitely possible for Cassandra to self-balance.

Let's not worry about scaring off users with this choice, choosing 4 will
not scare off users any more than 256 random tokens has scared off users
when they realized that they can't have any combination of two nodes down
in different racks.

-Joey

On Fri, Jan 31, 2020 at 10:16 AM Carl Mueller
 wrote:

> edit: 4 is bad at small cluster sizes and could scare off adoption
>
> On Fri, Jan 31, 2020 at 12:15 PM Carl Mueller <
> carl.muel...@smartthings.com>
> wrote:
>
> > "large/giant clusters and admins are the target audience for the value we
> > select"
> >
> > There are reasons aside from massive scale to pick cassandra, but the
> > primary reason cassandra is selected technically is to support horizontally
> > scaling to large clusters.
> >
> > Why pick a value that once you reach scale you need to switch token
> count?
> > It's still a ticking time bomb, although 16 won't be what 256 is.
> >
> > Hmmm. But 4 is bad and could scare off adoption.
> >
> > Ultimately a well-written article on operations and how to transition
> from
> > 16 --> 4 and at what point that is a good idea (aka not when your cluster
> > is too big) should be a critical part of this.
> >
> > On Fri, Jan 31, 2020 at 11:45 AM Michael Shuler 
> > wrote:
> >
> >> On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:
> >> > one corollary of the way the algorithm works (or more
> >> > precisely might not work) with multiple seeds or simultaneous
> >> > multi-node bootstraps or decommissions, is that a lot of dtests
> >> > start failing due to deterministic token conflicts. I wasn't
> >> > able to fix that by changing solely ccm and the dtests
> >> I appreciate all the detailed discussion. For a little historic context,
> >> since I brought up this topic in the contributors zoom meeting, unstable
> >> dtests was precisely the reason we moved the dtest configurations to
> >> 'num_tokens: 32'. That value has been used in CI dtest since something
> >> like 2014, when we found that this helped stabilize a large segment of
> >> flaky dtest failures. No real science there, other than "this hurts
> less."
> >>
> >> I have no real opinion on the suggestions of using 4 or 16, other than I
> >> believe most "default config using" new users are starting with smaller
> >> numbers of nodes. The small-but-growing users and veteran large cluster
> >> admins should be gaining more operational knowledge and be able to
> >> adjust their own config choices according to their needs 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Carl Mueller
"large/giant clusters and admins are the target audience for the value we
select"

There are reasons aside from massive scale to pick cassandra, but the
primary reason cassandra is selected technically is to support horizontally
scaling to large clusters.

Why pick a value that once you reach scale you need to switch token count?
It's still a ticking time bomb, although 16 won't be what 256 is.

Hmmm. But 4 is bad and could scare off adoption.

Ultimately a well-written article on operations and how to transition from
16 --> 4 and at what point that is a good idea (aka not when your cluster
is too big) should be a critical part of this.

On Fri, Jan 31, 2020 at 11:45 AM Michael Shuler 
wrote:

> On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:
> > one corollary of the way the algorithm works (or more
> > precisely might not work) with multiple seeds or simultaneous
> > multi-node bootstraps or decommissions, is that a lot of dtests
> > start failing due to deterministic token conflicts. I wasn't
> > able to fix that by changing solely ccm and the dtests
> I appreciate all the detailed discussion. For a little historic context,
> since I brought up this topic in the contributors zoom meeting, unstable
> dtests was precisely the reason we moved the dtest configurations to
> 'num_tokens: 32'. That value has been used in CI dtest since something
> like 2014, when we found that this helped stabilize a large segment of
> flaky dtest failures. No real science there, other than "this hurts less."
>
> I have no real opinion on the suggestions of using 4 or 16, other than I
> believe most "default config using" new users are starting with smaller
> numbers of nodes. The small-but-growing users and veteran large cluster
> admins should be gaining more operational knowledge and be able to
> adjust their own config choices according to their needs (and good
> comment suggestions in the yaml). Whatever default config value is
> chosen for num_tokens, I think it should suit the new users with smaller
> clusters. The suggestion Mick makes that 16 makes a better choice for
> small numbers of nodes, well, that would seem to be the better choice
> for those users we are trying to help the most with the default.
>
> I fully agree that science, maths, and support/ops experience should
> guide the choice, but I don't believe that large/giant clusters and
> admins are the target audience for the value we select.
>
> --
> Kind regards,
> Michael
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Carl Mueller
edit: 4 is bad at small cluster sizes and could scare off adoption

On Fri, Jan 31, 2020 at 12:15 PM Carl Mueller 
wrote:

> "large/giant clusters and admins are the target audience for the value we
> select"
>
> There are reasons aside from massive scale to pick cassandra, but the
> primary reason cassandra is selected technically is to support horizontally
> scaling to large clusters.
>
> Why pick a value that once you reach scale you need to switch token count?
> It's still a ticking time bomb, although 16 won't be what 256 is.
>
> Hmmm. But 4 is bad and could scare off adoption.
>
> Ultimately a well-written article on operations and how to transition from
> 16 --> 4 and at what point that is a good idea (aka not when your cluster
> is too big) should be a critical part of this.
>
> On Fri, Jan 31, 2020 at 11:45 AM Michael Shuler 
> wrote:
>
>> On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:
>> > one corollary of the way the algorithm works (or more
>> > precisely might not work) with multiple seeds or simultaneous
>> > multi-node bootstraps or decommissions, is that a lot of dtests
>> > start failing due to deterministic token conflicts. I wasn't
>> > able to fix that by changing solely ccm and the dtests
>> I appreciate all the detailed discussion. For a little historic context,
>> since I brought up this topic in the contributors zoom meeting, unstable
>> dtests was precisely the reason we moved the dtest configurations to
>> 'num_tokens: 32'. That value has been used in CI dtest since something
>> like 2014, when we found that this helped stabilize a large segment of
>> flaky dtest failures. No real science there, other than "this hurts less."
>>
>> I have no real opinion on the suggestions of using 4 or 16, other than I
>> believe most "default config using" new users are starting with smaller
>> numbers of nodes. The small-but-growing users and veteran large cluster
>> admins should be gaining more operational knowledge and be able to
>> adjust their own config choices according to their needs (and good
>> comment suggestions in the yaml). Whatever default config value is
>> chosen for num_tokens, I think it should suit the new users with smaller
>> clusters. The suggestion Mick makes that 16 makes a better choice for
>> small numbers of nodes, well, that would seem to be the better choice
>> for those users we are trying to help the most with the default.
>>
>> I fully agree that science, maths, and support/ops experience should
>> guide the choice, but I don't believe that large/giant clusters and
>> admins are the target audience for the value we select.
>>
>> --
>> Kind regards,
>> Michael
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Michael Shuler

On 1/31/20 9:58 AM, Dimitar Dimitrov wrote:

> one corollary of the way the algorithm works (or more
> precisely might not work) with multiple seeds or simultaneous
> multi-node bootstraps or decommissions, is that a lot of dtests
> start failing due to deterministic token conflicts. I wasn't
> able to fix that by changing solely ccm and the dtests

I appreciate all the detailed discussion. For a little historic context,
since I brought up this topic in the contributors zoom meeting, unstable
dtests were precisely the reason we moved the dtest configurations to
'num_tokens: 32'. That value has been used in CI dtest since something
like 2014, when we found that this helped stabilize a large segment of
flaky dtest failures. No real science there, other than "this hurts less."


I have no real opinion on the suggestions of using 4 or 16, other than I 
believe most "default config using" new users are starting with smaller 
numbers of nodes. The small-but-growing users and veteran large cluster 
admins should be gaining more operational knowledge and be able to 
adjust their own config choices according to their needs (and good 
comment suggestions in the yaml). Whatever default config value is 
chosen for num_tokens, I think it should suit the new users with smaller 
clusters. Mick's suggestion that 16 is a better choice for small numbers
of nodes would seem right for the users we are trying to help the most
with the default.


I fully agree that science, maths, and support/ops experience should 
guide the choice, but I don't believe that large/giant clusters and 
admins are the target audience for the value we select.


--
Kind regards,
Michael

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Carl Mueller
So why have virtual nodes at all? Why not work on improving single-token
approaches so that we can support cluster doubling, which IMO would let
Cassandra scale more quickly for volatile loads?

My guess/understanding is that vnodes eliminated the manual token
rebalancing that existed back in the days of single-token nodes. Did vnodes
also reduce the amount of data streamed during rebalancing/expansion?
Vnodes do let an expanding node stream from multiple sources, but if they
limit us to single-node expansion, that really limits flexibility on
clusters with large node counts.

Were there other advantages to VNodes that I missed?

IIRC, high vnode counts basically broke secondary indexes on
low-cardinality columns, and num_tokens=4 might help that a lot.

But if 4 hasn't shown any balancing issues, I'm all for it.

On Fri, Jan 31, 2020 at 9:58 AM Dimitar Dimitrov 
wrote:

> Hey all,
>
> At some point not too long ago I spent some time trying to
> make the token allocation algorithm the default.
>
> I didn't foresee it, although it might be obvious for many of
> you, but one corollary of the way the algorithm works (or more
> precisely might not work) with multiple seeds or simultaneous
> multi-node bootstraps or decommissions, is that a lot of dtests
> start failing due to deterministic token conflicts. I wasn't
> able to fix that by changing solely ccm and the dtests, unless
> careful, sequential node bootstrap was enforced. While it's strongly
> suggested to users to do exactly that in the real world, it would
> have exploded dtest run times to unacceptable levels.
>
> I have to clarify that what I'm working with is not exactly
> C*, and my knowledge of the C* codebase is not as up to date as
> I would want it to, but I suspect that the above problem might very
> well affect C* too, in which case changing the defaults might
> be a less-than-trivial undertaking.
>
> Regards,
> Dimitar
>
> On Fri, 31 Jan 2020 at 17:20, Joshua McKenzie 
> wrote:
>
> > >
> > > We should be using the default value that benefits the most people,
> > rather
> > > than an arbitrary compromise.
> >
> > I'd caution we're talking about the default value *we believe* will
> benefit
> > the most people according to our respective understandings of C* usage.
> >
> >  Most clusters don't shrink, they stay the same size or grow. I'd say 90%
> > > or more fall in this category.
> >
> > While I agree with the "most don't shrink, they stay the same or grow"
> > claim intuitively, there's a distinct difference impacting the 4 vs. 16
> > debate between what ratio we think stays the same size and what ratio we
> > think grows that I think informs this discussion.
> >
> > There's a *lot* of Cassandra out in the world, and these changes are
> going
> > to impact all of it. I'm not advocating a certain position on 4 vs. 16,
> but
> > I do think we need to be very careful about how strongly we hold our
> > beliefs and present them as facts in discussions like this.
> >
> > For my unsolicited .02, it sounds an awful lot like we're stuck between a
> > rock and a hard place in that there is no correct "one size fits all"
> > answer here (or, said another way: both 4 and 16 are correct, just for
> > different cases and we don't know / agree on which one we think is the
> > right one to target), so perhaps a discussion on a smart evolution of
> token
> > allocation counts based on quantized tiers of cluster size and dataset
> > growth (either automated or through operational best practices) could be
> > valuable along with this.
> >
> > On Fri, Jan 31, 2020 at 8:57 AM Alexander Dejanovski <
> > a...@thelastpickle.com>
> > wrote:
> >
> > > While I (mostly) understand the maths behind using 4 vnodes as a
> default
> > > (which really is a question of extreme availability), I don't think
> they
> > > provide noticeable performance improvements over using 16, while 16
> > vnodes
> > > will protect folks from imbalances. It is very hard to deal with
> > unbalanced
> > > clusters, and people start to deal with it once some nodes are already
> > > close to being full. Operationally, it's far from trivial.
> > > We're going to make some experiments at bootstrapping clusters with 4
> > > tokens on the latest alpha to see how much balance we can expect, and
> how
> > > removing one node could impact it.
> > >
> > > If we're talking about repairs, using 4 vnodes will generate
> > overstreaming,
> > > which can create lots of serious performance issues. Even on clusters
> > with
> > > 500GB of node density, we never use less than ~15 segments per node
> with
> > > Reaper.
> > > Not everyone uses Reaper, obviously, and there will be no protection
> > > against overstreaming with such a low default for folks not using
> > subrange
> > > repairs.
> > > On small clusters, even with 256 vnodes, using Cassandra 3.0/3.x and
> > Reaper
> > > already allows to get good repair performance because token ranges
> > sharing
> > > the exact same replicas will be processed in a single repair 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Dimitar Dimitrov
Hey all,

At some point not too long ago I spent some time trying to
make the token allocation algorithm the default.

I didn't foresee it, although it might be obvious for many of
you, but one corollary of the way the algorithm works (or more
precisely might not work) with multiple seeds or simultaneous
multi-node bootstraps or decommissions, is that a lot of dtests
start failing due to deterministic token conflicts. I wasn't
able to fix that by changing solely ccm and the dtests, unless
careful, sequential node bootstrap was enforced. While it's strongly
suggested to users to do exactly that in the real world, it would
have exploded dtest run times to unacceptable levels.
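
To make the failure mode concrete, here is a toy sketch in Python (a
deliberately simplified midpoint-of-largest-range allocator, not the real
ReplicationAwareTokenAllocator): two nodes that bootstrap simultaneously
from the same ring snapshot deterministically compute the same token and
collide.

RING = 2**64  # Murmur3-like token space size; sign offset ignored here

def allocate_one_token(existing):
    """Deterministically pick the midpoint of the widest range."""
    toks = sorted(existing)
    # forward gap from each token to its successor, wrapping the ring
    gaps = [((toks[(i + 1) % len(toks)] - t) % RING, t)
            for i, t in enumerate(toks)]
    width, start = max(gaps)
    return (start + width // 2) % RING

seed_ring = [0, RING // 3, 2 * RING // 3]  # three seed tokens

# Two nodes bootstrapping at the same time see the same ring state...
node_a = allocate_one_token(seed_ring)
node_b = allocate_one_token(seed_ring)
print(node_a == node_b)  # True -> deterministic token conflict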

I have to clarify that what I'm working with is not exactly
C*, and my knowledge of the C* codebase is not as up to date as
I would want it to be, but I suspect that the above problem might very
well affect C* too, in which case changing the defaults might
be a less-than-trivial undertaking.

Regards,
Dimitar

On Fri, 31 Jan 2020 at 17:20, Joshua McKenzie  wrote:

> >
> > We should be using the default value that benefits the most people,
> rather
> > than an arbitrary compromise.
>
> I'd caution we're talking about the default value *we believe* will benefit
> the most people according to our respective understandings of C* usage.
>
>  Most clusters don't shrink, they stay the same size or grow. I'd say 90%
> > or more fall in this category.
>
> While I agree with the "most don't shrink, they stay the same or grow"
> claim intuitively, there's a distinct difference impacting the 4 vs. 16
> debate between what ratio we think stays the same size and what ratio we
> think grows that I think informs this discussion.
>
> There's a *lot* of Cassandra out in the world, and these changes are going
> to impact all of it. I'm not advocating a certain position on 4 vs. 16, but
> I do think we need to be very careful about how strongly we hold our
> beliefs and present them as facts in discussions like this.
>
> For my unsolicited .02, it sounds an awful lot like we're stuck between a
> rock and a hard place in that there is no correct "one size fits all"
> answer here (or, said another way: both 4 and 16 are correct, just for
> different cases and we don't know / agree on which one we think is the
> right one to target), so perhaps a discussion on a smart evolution of token
> allocation counts based on quantized tiers of cluster size and dataset
> growth (either automated or through operational best practices) could be
> valuable along with this.
>
> On Fri, Jan 31, 2020 at 8:57 AM Alexander Dejanovski <
> a...@thelastpickle.com>
> wrote:
>
> > While I (mostly) understand the maths behind using 4 vnodes as a default
> > (which really is a question of extreme availability), I don't think they
> > provide noticeable performance improvements over using 16, while 16
> vnodes
> > will protect folks from imbalances. It is very hard to deal with
> unbalanced
> > clusters, and people start to deal with it once some nodes are already
> > close to being full. Operationally, it's far from trivial.
> > We're going to make some experiments at bootstrapping clusters with 4
> > tokens on the latest alpha to see how much balance we can expect, and how
> > removing one node could impact it.
> >
> > If we're talking about repairs, using 4 vnodes will generate
> overstreaming,
> > which can create lots of serious performance issues. Even on clusters
> with
> > 500GB of node density, we never use less than ~15 segments per node with
> > Reaper.
> > Not everyone uses Reaper, obviously, and there will be no protection
> > against overstreaming with such a low default for folks not using
> subrange
> > repairs.
> > On small clusters, even with 256 vnodes, using Cassandra 3.0/3.x and
> Reaper
> > already allows to get good repair performance because token ranges
> sharing
> > the exact same replicas will be processed in a single repair session. On
> > large clusters, I reckon it's good to have way less vnodes to speed up
> > repairs.
> >
> > Cassandra 4.0 is supposed to aim at providing a rock stable release of
> > Cassandra, fixing past instabilities, and I think lowering to 4 tokens by
> > default defeats that purpose.
> > 16 tokens is a reasonable compromise for clusters of all sizes, without
> > being too aggressive. Those with enough C* experience can still lower
> that
> > number for their clusters.
> >
> > Cheers,
> >
> > -
> > Alexander Dejanovski
> > France
> > @alexanderdeja
> >
> > Consultant
> > Apache Cassandra Consulting
> > http://www.thelastpickle.com
> >
> >
> > On Fri, Jan 31, 2020 at 1:41 PM Mick Semb Wever  wrote:
> >
> > >
> > > > TLDR, based on availability concerns, skew concerns, operational
> > > > concerns, and based on the fact that the new allocation algorithm can
> > > > be configured fairly simply now, this is a proposal to go with 4 as
> the
> > > > new default and the allocate_tokens_for_local_replication_factor set
> to
> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Joshua McKenzie
>
> We should be using the default value that benefits the most people, rather
> than an arbitrary compromise.

I'd caution we're talking about the default value *we believe* will benefit
the most people according to our respective understandings of C* usage.

 Most clusters don't shrink, they stay the same size or grow. I'd say 90%
> or more fall in this category.

While I intuitively agree with the "most don't shrink, they stay the same
or grow" claim, the 4 vs. 16 debate hinges on the distinct difference
between what ratio we think stays the same size and what ratio we think
grows, and I think that difference informs this discussion.

There's a *lot* of Cassandra out in the world, and these changes are going
to impact all of it. I'm not advocating a certain position on 4 vs. 16, but
I do think we need to be very careful about how strongly we hold our
beliefs and present them as facts in discussions like this.

For my unsolicited .02, it sounds an awful lot like we're stuck between a
rock and a hard place in that there is no correct "one size fits all"
answer here (or, said another way: both 4 and 16 are correct, just for
different cases and we don't know / agree on which one we think is the
right one to target), so perhaps a discussion on a smart evolution of token
allocation counts based on quantized tiers of cluster size and dataset
growth (either automated or through operational best practices) could be
valuable along with this.
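
As a strawman for those quantized tiers, the thresholds floated in this
thread might look like the following (the numbers echo opinions above, not
any agreed policy):

def suggested_num_tokens(expected_nodes_per_dc: int) -> int:
    """Strawman tiering only; thresholds echo Mick's ~36-node arithmetic
    and Anthony's 100-node comment elsewhere in the thread."""
    if expected_nodes_per_dc > 36:
        # big enough that only a small share of tokens is randomly
        # generated, and where high token counts hurt the most
        return 4
    return 16  # smaller clusters: favor out-of-the-box balance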

On Fri, Jan 31, 2020 at 8:57 AM Alexander Dejanovski 
wrote:

> While I (mostly) understand the maths behind using 4 vnodes as a default
> (which really is a question of extreme availability), I don't think they
> provide noticeable performance improvements over using 16, while 16 vnodes
> will protect folks from imbalances. It is very hard to deal with unbalanced
> clusters, and people start to deal with it once some nodes are already
> close to being full. Operationally, it's far from trivial.
> We're going to make some experiments at bootstrapping clusters with 4
> tokens on the latest alpha to see how much balance we can expect, and how
> removing one node could impact it.
>
> If we're talking about repairs, using 4 vnodes will generate overstreaming,
> which can create lots of serious performance issues. Even on clusters with
> 500GB of node density, we never use less than ~15 segments per node with
> Reaper.
> Not everyone uses Reaper, obviously, and there will be no protection
> against overstreaming with such a low default for folks not using subrange
> repairs.
> On small clusters, even with 256 vnodes, using Cassandra 3.0/3.x and Reaper
> already allows to get good repair performance because token ranges sharing
> the exact same replicas will be processed in a single repair session. On
> large clusters, I reckon it's good to have way less vnodes to speed up
> repairs.
>
> Cassandra 4.0 is supposed to aim at providing a rock stable release of
> Cassandra, fixing past instabilities, and I think lowering to 4 tokens by
> default defeats that purpose.
> 16 tokens is a reasonable compromise for clusters of all sizes, without
> being too aggressive. Those with enough C* experience can still lower that
> number for their clusters.
>
> Cheers,
>
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> On Fri, Jan 31, 2020 at 1:41 PM Mick Semb Wever  wrote:
>
> >
> > > TLDR, based on availability concerns, skew concerns, operational
> > > concerns, and based on the fact that the new allocation algorithm can
> > > be configured fairly simply now, this is a proposal to go with 4 as the
> > > new default and the allocate_tokens_for_local_replication_factor set to
> > > 3.
> >
> >
> > I'm uncomfortable going with the default of `num_tokens: 4`.
> > I would rather see a default of `num_tokens: 16` based on the following…
> >
> > a) 4 num_tokens does not provide a good out-of-the-box experience.
> > b) 4 num_tokens doesn't provide any significant streaming benefits over
> 16.
> > c)  edge-case availability doesn't trump (a) & (b)
> >
> >
> > For (a)…
> >  The first node in each rack, up to RF racks, in each datacenter can't
> use
> > the allocation strategy. With 4 num_tokens, 3 racks and RF=3, the first
> > three nodes will be poorly balanced. If three poorly unbalanced nodes in
> a
> > cluster is an issue (because the cluster is small enough) therefore 4 is
> > the wrong default. From our own experience, we have had to bootstrap
> these
> > nodes multiple times until they generate something ok. In practice 4
> > num_tokens (over 16) has provided more headache with clients than gain.
> >
> > Elaborating, 256 was originally chosen because the token randomness over
> > that many always averaged out. With a default of
> > `allocate_tokens_for_local_replication_factor: 3` this issue is largely
> > solved, but you will still have those initial nodes with randomly
> generated
> > tokens. Ref:
> >
> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Alexander Dejanovski
While I (mostly) understand the maths behind using 4 vnodes as a default
(which really is a question of extreme availability), I don't think they
provide noticeable performance improvements over using 16, while 16 vnodes
will protect folks from imbalances. Unbalanced clusters are very hard to
deal with, and people usually only start dealing with them once some nodes
are already close to full. Operationally, it's far from trivial.
We're going to make some experiments at bootstrapping clusters with 4
tokens on the latest alpha to see how much balance we can expect, and how
removing one node could impact it.

If we're talking about repairs, using 4 vnodes will generate overstreaming,
which can create lots of serious performance issues. Even on clusters with
500GB of node density, we never use less than ~15 segments per node with
Reaper.
Not everyone uses Reaper, obviously, and there will be no protection
against overstreaming with such a low default for folks not using subrange
repairs.
On small clusters, even with 256 vnodes, using Cassandra 3.0/3.x and Reaper
already makes it possible to get good repair performance because token ranges sharing
the exact same replicas will be processed in a single repair session. On
large clusters, I reckon it's good to have way less vnodes to speed up
repairs.
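
For a rough sense of the overstreaming numbers, a back-of-envelope sketch
(my own arithmetic, assuming data is spread evenly across ranges and one
session per vnode range or per Reaper segment; this is not Reaper's actual
segmentation logic):

def gb_per_repair_session(node_density_gb: float, sessions_per_node: int) -> float:
    """Data one repair session touches, assuming the node's data is
    spread evenly across its token ranges / repair segments."""
    return node_density_gb / sessions_per_node

print(gb_per_repair_session(500, 4))   # num_tokens=4, whole-range sessions: 125.0 GB
print(gb_per_repair_session(500, 15))  # Reaper subrange, ~15 segments: ~33.3 GB
print(gb_per_repair_session(500, 16))  # num_tokens=16, whole-range sessions: 31.25 GB

With only 4 ranges, any session that has to be retried re-validates and
potentially re-streams roughly four times as much data as with 16.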

Cassandra 4.0 is supposed to provide a rock-solid, stable release of
Cassandra, fixing past instabilities, and I think lowering the default to 4
tokens defeats that purpose.
16 tokens is a reasonable compromise for clusters of all sizes, without
being too aggressive. Those with enough C* experience can still lower that
number for their clusters.

Cheers,

-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


On Fri, Jan 31, 2020 at 1:41 PM Mick Semb Wever  wrote:

>
> > TLDR, based on availability concerns, skew concerns, operational
> > concerns, and based on the fact that the new allocation algorithm can
> > be configured fairly simply now, this is a proposal to go with 4 as the
> > new default and the allocate_tokens_for_local_replication_factor set to
> > 3.
>
>
> I'm uncomfortable going with the default of `num_tokens: 4`.
> I would rather see a default of `num_tokens: 16` based on the following…
>
> a) 4 num_tokens does not provide a good out-of-the-box experience.
> b) 4 num_tokens doesn't provide any significant streaming benefits over 16.
> c)  edge-case availability doesn't trump (a) & (b)
>
>
> For (a)…
>  The first node in each rack, up to RF racks, in each datacenter can't use
> the allocation strategy. With 4 num_tokens, 3 racks and RF=3, the first
> three nodes will be poorly balanced. If three poorly unbalanced nodes in a
> cluster is an issue (because the cluster is small enough) therefore 4 is
> the wrong default. From our own experience, we have had to bootstrap these
> nodes multiple times until they generate something ok. In practice 4
> num_tokens (over 16) has provided more headache with clients than gain.
>
> Elaborating, 256 was originally chosen because the token randomness over
> that many always averaged out. With a default of
> `allocate_tokens_for_local_replication_factor: 3` this issue is largely
> solved, but you will still have those initial nodes with randomly generated
> tokens. Ref:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/dht/tokenallocator/ReplicationAwareTokenAllocator.java#L80
> And to be precise: tokens are randomly generated until there is a node in
> each rack up to RF racks. So, if you have RF=3, in theory (or are a newbie)
> you could boot 100 nodes only in the first two racks, and they will all be
> random tokens regardless of the
> allocate_tokens_for_local_replication_factor setting.
>
> For example, using 4 num_tokens, 3 racks and RF=3…
>  - in a 6 node cluster, there's a total of 24 tokens, half of which are
> random,
>  - in a 9 node cluster, there's a total of 36 tokens, a third of which is
> random,
>  - etc
>
> Following this logic i would not be willing to apply 4 unless you know
> there will be more than 36 nodes in each data centre, ie less than ~8% of
> your tokens are randomly generated. Many clusters don't have that size, and
> imho that's why 4 is a bad default.
>
> A default of 16 by the same logic only needs 9 nodes in each dc to
> overcome that randomness degree.
>
> The workaround to all this is having to manually define `initial_token: …`
> on those initial nodes. I'm really not inspired imposing that upon new
> users.
>
> For (b)…
>  there's been a number of improvements already around streaming that
> solves much of what would be any difference there is between 4 and 16
> num_tokens. And 4 num_tokens means bigger token ranges so could well be
> disadvantageous due to over-streaming.
>
> For (c)…
>  we are trying to optimise availability in situations we can never
> guarantee availability. I understand it's a nice operational advantage to
> have in a 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-31 Thread Mick Semb Wever


> TLDR, based on availability concerns, skew concerns, operational 
> concerns, and based on the fact that the new allocation algorithm can 
> be configured fairly simply now, this is a proposal to go with 4 as the 
> new default and the allocate_tokens_for_local_replication_factor set to 
> 3.  


I'm uncomfortable going with the default of `num_tokens: 4`.
I would rather see a default of `num_tokens: 16` based on the following…

a) 4 num_tokens does not provide a good out-of-the-box experience.
b) 4 num_tokens doesn't provide any significant streaming benefits over 16.
c)  edge-case availability doesn't trump (a) & (b)


For (a)…
 The first node in each rack, up to RF racks, in each datacenter can't use the 
allocation strategy. With 4 num_tokens, 3 racks and RF=3, the first three nodes 
will be poorly balanced. If three poorly balanced nodes in a cluster are an 
issue (because the cluster is small enough), then 4 is the wrong default. 
From our own experience, we have had to bootstrap these nodes multiple times 
until they generate something ok. In practice 4 num_tokens (over 16) has 
caused more headaches with clients than gains.

Elaborating, 256 was originally chosen because the token randomness over that 
many always averaged out. With a default of  
`allocate_tokens_for_local_replication_factor: 3` this issue is largely solved, 
but you will still have those initial nodes with randomly generated tokens. 
Ref: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/dht/tokenallocator/ReplicationAwareTokenAllocator.java#L80
And to be precise: tokens are randomly generated until there is a node in each 
rack up to RF racks. So, if you have RF=3, in theory (or are a newbie) you 
could boot 100 nodes only in the first two racks, and they will all be random 
tokens regardless of the allocate_tokens_for_local_replication_factor setting.

For example, using 4 num_tokens, 3 racks and RF=3…
 - in a 6 node cluster, there's a total of 24 tokens, half of which are random,
 - in a 9 node cluster, there's a total of 36 tokens, a third of which is 
random,
 - etc

Following this logic, I would not be willing to apply 4 unless you know there 
will be more than 36 nodes in each data centre, i.e. less than ~8% of your 
tokens randomly generated. Many clusters never reach that size, and IMHO 
that's why 4 is a bad default. 

By the same logic, a default of 16 only needs 9 nodes in each DC to overcome 
that degree of randomness.
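
That arithmetic is easy to check. A minimal sketch, assuming one node per
rack gets random tokens until min(racks, RF) racks are populated and nodes
are added round-robin across racks; note the per-node fraction comes out
the same for any num_tokens, the advantage of more tokens per node being
that random tokens average out better:

def random_token_fraction(nodes_per_dc: int, racks: int = 3, rf: int = 3) -> float:
    """Fraction of a DC's tokens that were randomly generated: one node
    per rack receives random tokens until min(racks, rf) racks are filled."""
    random_nodes = min(nodes_per_dc, min(racks, rf))
    return random_nodes / nodes_per_dc

for n in (6, 9, 36):
    print(n, f"{random_token_fraction(n):.0%}")  # 6 -> 50%, 9 -> 33%, 36 -> 8%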

The workaround to all this is having to manually define `initial_token: …` on 
those initial nodes. I'm really not inspired to impose that upon new users.
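
For reference, those even tokens can be computed mechanically, along the
lines of the recipe in The Last Pickle post linked elsewhere in this thread
(the layout below is illustrative, not project tooling):

# Evenly spaced, interleaved Murmur3 tokens for the first few nodes
# (e.g. one per rack) that would otherwise receive random tokens.
num_tokens = 4
initial_nodes = 3  # e.g. 3 racks, RF=3

slots = num_tokens * initial_nodes
for node in range(initial_nodes):
    tokens = [str((2**64 // slots) * (i * initial_nodes + node) - 2**63)
              for i in range(num_tokens)]
    print(f"node {node}  initial_token: {','.join(tokens)}")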

For (b)…
 there have already been a number of improvements around streaming that erase 
much of whatever difference there is between 4 and 16 num_tokens. And 4 
num_tokens means bigger token ranges, so it could well be disadvantageous due 
to over-streaming.

For (c)…
 we are trying to optimise availability in situations where we can never 
guarantee availability. I understand it's a nice operational advantage to 
have in a shit-show, but it's not a systems design that you can rely upon. 
There's also the question of availability vs. the size of the token range 
that becomes unavailable.



regards,
Mick


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Jon Haddad
Yes, I'm against it. We should be using the default value that benefits the
most people, rather than an arbitrary compromise.

Most clusters don't shrink; they stay the same size or grow. I'd say 90% or
more fall in this category.  Let's do the right thing by default and
include good comments that help people make the right decision if they
think they'll be outside the usual case.

On Thu, Jan 30, 2020, 8:07 PM Joseph Lynch  wrote:

> Any objections to the compromise of 16 as proposed in Chris's original
> patch?
>
> -Joey
>
> On Thu, Jan 30, 2020, 3:47 PM Anthony Grasso 
> wrote:
>
> > I think lowering the number of tokens is a great idea! Similar to Jon,
> when
> > I have reduced the number of tokens for clients it has been improvement
> in
> > repair performance.
> >
> > I am concerned that the proposed default value for num_tokens is too low.
> > If you set up a cluster using the proposed defaults, you will get a
> > balanced cluster. However, if you decommission nodes you will start to
> see
> > large imbalances especially for small clusters (< 20 nodes). This is
> > because the allocate_tokens_for_local_replication_factor setting is only
> > applied during the bootstrap process.
> >
> > I have recommended very low values for num_tokens to clients. This was
> > because it was very unlikely that they would reduce their cluster size
> and
> > I warned them of the caveats with using a small value for num_tokens.
> >
> > The proposed num_token default value is fine for devs and operators that
> > know what they are doing. However, the general Cassandra community will
> be
> > unaware of the potential issue with such a low value. We should consider
> > setting num_tokens to 16 - 32 as the default. This will at least help
> > reduce the severity of the imbalance when decommissioning a node whilst
> > still providing the benefits of having a low number of tokens. In
> addition,
> > we can add a comment to num_tokens that clusters over 100 nodes (per
> > datacenter) should consider reducing it down to 4.
> >
> > Cheers,
> > Anthony
> >
> > On Fri, 31 Jan 2020 at 01:58, Jon Haddad  wrote:
> >
> > > Larger clusters is where high token counts do the most damage. That's
> why
> > > it's such a problem. You start out with a small cluster using 256, as
> you
> > > grow into the hundreds it becomes more and more unstable.
> > >
> > >
> > > On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
> > >  wrote:
> > >
> > > > Shouldn't we consider the cluster size to configure num_tokens?
> > > >
> > > > For example is it OK to use num_tokens=4 for a cluster of more than
> 100
> > > of
> > > > nodes?
> > > >
> > > >
> > > >
> > > > Another question that is not so much relevant to this :
> > > >
> > > > When we use the token assignment algorithm (the new/non-random one)
> > for a
> > > > specific keyspace, why should we use initial token for all the seeds,
> > > isn't
> > > > one seed enough and then just set the keyspace for all other nodes?
> > > >
> > > >
> > > >
> > > > Also i do not understand why should we consider rack topology and
> > number
> > > > of racks for configuration of num_tokens?
> > > >
> > > >
> > > >
> > > > Sent using https://www.zoho.com/mail/
> > > >
> > > >
> > > >
> > > >
> > > >  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> > > > jeremy.hanna1...@gmail.com> wrote 
> > > >
> > > >
> > > > The new default wouldn't be retroactively set for 3.x, but the same
> > > > principles apply.  The new algorithm is in 3.x as well as the
> > > > simplification of the configuration.  So no reason not to use the
> same
> > > > configuration on 3.x.
> > > >
> > > > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek  > > > dchen...@amazon.com.INVALID> wrote:
> > > > >
> > > > > Does the same guidance apply to 3.x clusters? I read through the
> JIRA
> > > > ticket linked below, along with tickets that it links to, but it's
> not
> > > > clear that the new allocation algorithm is available in 3.x or if
> there
> > > are
> > > > other reasons that this would be problematic.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Derek
> > > > >
> > > > > On 1/29/20, 9:54 AM, "Jon Haddad" 
> wrote:
> > > > >
> > > > >Ive put a lot of my previous clients on 4 tokens, all of which
> > have
> > > > >resulted in a major improvement.
> > > > >
> > > > >I wouldn't use any more than 4 except under some pretty unusual
> > > > >circumstances.
> > > > >
> > > > >Jon
> > > > >
> > > > >On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  > > > b...@instaclustr.com> wrote:
> > > > >
> > > > >> +1 to reducing the number of tokens as low as possible for
> > > availability
> > > > >> issues. 4 lgtm
> > > > >>
> > > > >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  > > djo...@apache.org>
> > > > wrote:
> > > > >>
> > > > >>> Thanks for restarting this discussion Jeremy. I personally think
> 4
> > is
> > > > a
> > > > >>> good number as a default. I think whatever we pick, we should
> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Joseph Lynch
Any objections to the compromise of 16 as proposed in Chris's original
patch?

-Joey

On Thu, Jan 30, 2020, 3:47 PM Anthony Grasso 
wrote:

> I think lowering the number of tokens is a great idea! Similar to Jon, when
> I have reduced the number of tokens for clients it has been improvement in
> repair performance.
>
> I am concerned that the proposed default value for num_tokens is too low.
> If you set up a cluster using the proposed defaults, you will get a
> balanced cluster. However, if you decommission nodes you will start to see
> large imbalances especially for small clusters (< 20 nodes). This is
> because the allocate_tokens_for_local_replication_factor setting is only
> applied during the bootstrap process.
>
> I have recommended very low values for num_tokens to clients. This was
> because it was very unlikely that they would reduce their cluster size and
> I warned them of the caveats with using a small value for num_tokens.
>
> The proposed num_token default value is fine for devs and operators that
> know what they are doing. However, the general Cassandra community will be
> unaware of the potential issue with such a low value. We should consider
> setting num_tokens to 16 - 32 as the default. This will at least help
> reduce the severity of the imbalance when decommissioning a node whilst
> still providing the benefits of having a low number of tokens. In addition,
> we can add a comment to num_tokens that clusters over 100 nodes (per
> datacenter) should consider reducing it down to 4.
>
> Cheers,
> Anthony
>
> On Fri, 31 Jan 2020 at 01:58, Jon Haddad  wrote:
>
> > Larger clusters is where high token counts do the most damage. That's why
> > it's such a problem. You start out with a small cluster using 256, as you
> > grow into the hundreds it becomes more and more unstable.
> >
> >
> > On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
> >  wrote:
> >
> > > Shouldn't we consider the cluster size to configure num_tokens?
> > >
> > > For example is it OK to use num_tokens=4 for a cluster of more than 100
> > of
> > > nodes?
> > >
> > >
> > >
> > > Another question that is not so much relevant to this :
> > >
> > > When we use the token assignment algorithm (the new/non-random one)
> for a
> > > specific keyspace, why should we use initial token for all the seeds,
> > isn't
> > > one seed enough and then just set the keyspace for all other nodes?
> > >
> > >
> > >
> > > Also i do not understand why should we consider rack topology and
> number
> > > of racks for configuration of num_tokens?
> > >
> > >
> > >
> > > Sent using https://www.zoho.com/mail/
> > >
> > >
> > >
> > >
> > >  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> > > jeremy.hanna1...@gmail.com> wrote 
> > >
> > >
> > > The new default wouldn't be retroactively set for 3.x, but the same
> > > principles apply.  The new algorithm is in 3.x as well as the
> > > simplification of the configuration.  So no reason not to use the same
> > > configuration on 3.x.
> > >
> > > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek  > > dchen...@amazon.com.INVALID> wrote:
> > > >
> > > > Does the same guidance apply to 3.x clusters? I read through the JIRA
> > > ticket linked below, along with tickets that it links to, but it's not
> > > clear that the new allocation algorithm is available in 3.x or if there
> > are
> > > other reasons that this would be problematic.
> > > >
> > > > Thanks,
> > > >
> > > > Derek
> > > >
> > > > On 1/29/20, 9:54 AM, "Jon Haddad"  wrote:
> > > >
> > > >Ive put a lot of my previous clients on 4 tokens, all of which
> have
> > > >resulted in a major improvement.
> > > >
> > > >I wouldn't use any more than 4 except under some pretty unusual
> > > >circumstances.
> > > >
> > > >Jon
> > > >
> > > >On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  > > b...@instaclustr.com> wrote:
> > > >
> > > >> +1 to reducing the number of tokens as low as possible for
> > availability
> > > >> issues. 4 lgtm
> > > >>
> > > >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  > djo...@apache.org>
> > > wrote:
> > > >>
> > > >>> Thanks for restarting this discussion Jeremy. I personally think 4
> is
> > > a
> > > >>> good number as a default. I think whatever we pick, we should have
> > > enough
> > > >>> documentation for operators to make sense of the new defaults in
> 4.0.
> > > >>>
> > > >>> Dinesh
> > > >>>
> > >  On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  > > jeremy.hanna1...@gmail.com>
> > > >>> wrote:
> > > 
> > >  I wanted to start a discussion about the default for num_tokens
> that
> > > >>> we'd like for people starting in Cassandra 4.0.  This is for ticket
> > > >>> CASSANDRA-13701 <
> > https://issues.apache.org/jira/browse/CASSANDRA-13701>
> > >
> > > >>> (which has been duplicated a number of times, most recently by me).
> > > 
> > >  TLDR, based on availability concerns, skew concerns, operational
> > > >>> concerns, and based on the fact 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Anthony Grasso
I think lowering the number of tokens is a great idea! Similar to Jon, when
I have reduced the number of tokens for clients it has improved repair
performance.

I am concerned that the proposed default value for num_tokens is too low.
If you set up a cluster using the proposed defaults, you will get a
balanced cluster. However, if you decommission nodes you will start to see
large imbalances especially for small clusters (< 20 nodes). This is
because the allocate_tokens_for_local_replication_factor setting is only
applied during the bootstrap process.

I have recommended very low values for num_tokens to clients. This was
because it was very unlikely that they would reduce their cluster size and
I warned them of the caveats with using a small value for num_tokens.

The proposed num_tokens default value is fine for devs and operators that
know what they are doing. However, the general Cassandra community will be
unaware of the potential issue with such a low value. We should consider
setting num_tokens to 16-32 as the default. This will at least help
reduce the severity of the imbalance when decommissioning a node whilst
still providing the benefits of having a low number of tokens. In addition,
we can add a comment to num_tokens that clusters over 100 nodes (per
datacenter) should consider reducing it down to 4.
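
To put rough numbers on that, here is a small simulation of my own
construction (not Cassandra code): start from a perfectly balanced ring,
decommission one node, and measure the worst surviving node's primary
ownership relative to the mean. The exact values are synthetic, but lower
token counts leave bigger inherited chunks:

import random
import statistics

RING = 2**64

def imbalance_after_decommission(nodes=12, num_tokens=4, trials=200):
    """Average max/mean primary ownership among survivors after node 0
    leaves. Tokens are evenly spaced (a perfectly balanced start), with
    owners interleaved at random, as they would be in practice."""
    random.seed(7)
    results = []
    for _ in range(trials):
        owners = [i % nodes for i in range(nodes * num_tokens)]
        random.shuffle(owners)
        step = RING // (nodes * num_tokens)
        ring = [(step * i, owners[i]) for i in range(nodes * num_tokens)]
        survivors = [(t, o) for t, o in ring if o != 0]
        own = {n: 0 for n in range(1, nodes)}
        for i, (tok, o) in enumerate(survivors):
            nxt = survivors[(i + 1) % len(survivors)][0]
            own[o] += (nxt - tok) % RING  # forward range, wrapping the ring
        results.append(max(own.values()) / statistics.mean(own.values()))
    return statistics.mean(results)

for t in (4, 16, 256):
    print(t, round(imbalance_after_decommission(num_tokens=t), 2))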

Cheers,
Anthony

On Fri, 31 Jan 2020 at 01:58, Jon Haddad  wrote:

> Larger clusters is where high token counts do the most damage. That's why
> it's such a problem. You start out with a small cluster using 256, as you
> grow into the hundreds it becomes more and more unstable.
>
>
> On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
>  wrote:
>
> > Shouldn't we consider the cluster size to configure num_tokens?
> >
> > For example is it OK to use num_tokens=4 for a cluster of more than 100
> of
> > nodes?
> >
> >
> >
> > Another question that is not so much relevant to this :
> >
> > When we use the token assignment algorithm (the new/non-random one) for a
> > specific keyspace, why should we use initial token for all the seeds,
> isn't
> > one seed enough and then just set the keyspace for all other nodes?
> >
> >
> >
> > Also i do not understand why should we consider rack topology and number
> > of racks for configuration of num_tokens?
> >
> >
> >
> > Sent using https://www.zoho.com/mail/
> >
> >
> >
> >
> >  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> > jeremy.hanna1...@gmail.com> wrote 
> >
> >
> > The new default wouldn't be retroactively set for 3.x, but the same
> > principles apply.  The new algorithm is in 3.x as well as the
> > simplification of the configuration.  So no reason not to use the same
> > configuration on 3.x.
> >
> > > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek  > dchen...@amazon.com.INVALID> wrote:
> > >
> > > Does the same guidance apply to 3.x clusters? I read through the JIRA
> > ticket linked below, along with tickets that it links to, but it's not
> > clear that the new allocation algorithm is available in 3.x or if there
> are
> > other reasons that this would be problematic.
> > >
> > > Thanks,
> > >
> > > Derek
> > >
> > > On 1/29/20, 9:54 AM, "Jon Haddad"  wrote:
> > >
> > >Ive put a lot of my previous clients on 4 tokens, all of which have
> > >resulted in a major improvement.
> > >
> > >I wouldn't use any more than 4 except under some pretty unusual
> > >circumstances.
> > >
> > >Jon
> > >
> > >On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  > b...@instaclustr.com> wrote:
> > >
> > >> +1 to reducing the number of tokens as low as possible for
> availability
> > >> issues. 4 lgtm
> > >>
> > >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  djo...@apache.org>
> > wrote:
> > >>
> > >>> Thanks for restarting this discussion Jeremy. I personally think 4 is
> > a
> > >>> good number as a default. I think whatever we pick, we should have
> > enough
> > >>> documentation for operators to make sense of the new defaults in 4.0.
> > >>>
> > >>> Dinesh
> > >>>
> >  On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  > jeremy.hanna1...@gmail.com>
> > >>> wrote:
> > 
> >  I wanted to start a discussion about the default for num_tokens that
> > >>> we'd like for people starting in Cassandra 4.0.  This is for ticket
> > >>> CASSANDRA-13701 <
> https://issues.apache.org/jira/browse/CASSANDRA-13701>
> >
> > >>> (which has been duplicated a number of times, most recently by me).
> > 
> >  TLDR, based on availability concerns, skew concerns, operational
> > >>> concerns, and based on the fact that the new allocation algorithm can
> > be
> > >>> configured fairly simply now, this is a proposal to go with 4 as the
> > new
> > >>> default and the allocate_tokens_for_local_replication_factor set to
> 3.
> > >>> That gives a good experience out of the box for people and is the
> most
> > >>> conservative.  It does assume that racks and DCs have been configured
> > >>> correctly.  We would, 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread Jon Haddad
Larger clusters are where high token counts do the most damage; that's why
it's such a problem. You start out with a small cluster using 256, and as
you grow into the hundreds of nodes it becomes more and more unstable.
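
A crude Monte Carlo makes the trade-off visible (this is a heavy
simplification of the availability analysis referenced elsewhere in this
thread: random token placement, SimpleStrategy-style replica walks, no
racks). It estimates the chance that two simultaneous node failures drop
two replicas of some range at RF=3:

import random

def quorum_loss_probability(nodes=100, num_tokens=4, rf=3, trials=200):
    """Estimated probability that two random node failures leave some
    token range with two of its rf replicas down."""
    random.seed(11)
    hits = 0
    for _ in range(trials):
        ring = sorted((random.random(), n) for n in range(nodes)
                      for _ in range(num_tokens))
        owners = [o for _, o in ring]
        down = set(random.sample(range(nodes), 2))
        for i in range(len(owners)):
            # replica set of range i: next rf distinct owners clockwise
            replicas, j = [], i
            while len(replicas) < rf:
                o = owners[j % len(owners)]
                if o not in replicas:
                    replicas.append(o)
                j += 1
            if len(down & set(replicas)) >= 2:
                hits += 1
                break
    return hits / trials

for t in (4, 16, 256):
    print(t, quorum_loss_probability(num_tokens=t))

The more tokens per node, the more distinct peers each node shares replica
sets with, so the chance that any two down nodes co-own a range climbs
toward 1.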


On Thu, Jan 30, 2020, 8:19 AM onmstester onmstester
 wrote:

> Shouldn't we consider the cluster size to configure num_tokens?
>
> For example is it OK to use num_tokens=4 for a cluster of more than 100 of
> nodes?
>
>
>
> Another question that is not so much relevant to this :
>
> When we use the token assignment algorithm (the new/non-random one) for a
> specific keyspace, why should we use initial token for all the seeds, isn't
> one seed enough and then just set the keyspace for all other nodes?
>
>
>
> Also i do not understand why should we consider rack topology and number
> of racks for configuration of num_tokens?
>
>
>
> Sent using https://www.zoho.com/mail/
>
>
>
>
>  On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna <
> jeremy.hanna1...@gmail.com> wrote 
>
>
> The new default wouldn't be retroactively set for 3.x, but the same
> principles apply.  The new algorithm is in 3.x as well as the
> simplification of the configuration.  So no reason not to use the same
> configuration on 3.x.
>
> > On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek  dchen...@amazon.com.INVALID> wrote:
> >
> > Does the same guidance apply to 3.x clusters? I read through the JIRA
> ticket linked below, along with tickets that it links to, but it's not
> clear that the new allocation algorithm is available in 3.x or if there are
> other reasons that this would be problematic.
> >
> > Thanks,
> >
> > Derek
> >
> > On 1/29/20, 9:54 AM, "Jon Haddad"  wrote:
> >
> >Ive put a lot of my previous clients on 4 tokens, all of which have
> >resulted in a major improvement.
> >
> >I wouldn't use any more than 4 except under some pretty unusual
> >circumstances.
> >
> >Jon
> >
> >On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  b...@instaclustr.com> wrote:
> >
> >> +1 to reducing the number of tokens as low as possible for availability
> >> issues. 4 lgtm
> >>
> >> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi 
> wrote:
> >>
> >>> Thanks for restarting this discussion Jeremy. I personally think 4 is
> a
> >>> good number as a default. I think whatever we pick, we should have
> enough
> >>> documentation for operators to make sense of the new defaults in 4.0.
> >>>
> >>> Dinesh
> >>>
>  On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  jeremy.hanna1...@gmail.com>
> >>> wrote:
> 
>  I wanted to start a discussion about the default for num_tokens that
> >>> we'd like for people starting in Cassandra 4.0.  This is for ticket
> >>> CASSANDRA-13701 
>
> >>> (which has been duplicated a number of times, most recently by me).
> 
>  TLDR, based on availability concerns, skew concerns, operational
> >>> concerns, and based on the fact that the new allocation algorithm can
> be
> >>> configured fairly simply now, this is a proposal to go with 4 as the
> new
> >>> default and the allocate_tokens_for_local_replication_factor set to 3.
> >>> That gives a good experience out of the box for people and is the most
> >>> conservative.  It does assume that racks and DCs have been configured
> >>> correctly.  We would, of course, go into some detail in the NEWS.txt.
> 
>  Joey Lynch and Josh Snyder did an extensive analysis of availability
> >>> concerns with high num_tokens/virtual nodes in their paper <
> >>>
> >>
> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E
> >>> .
> >>> This worsens as clusters grow larger.  I won't quote the paper here
> but
> >> in
> >>> order to have a conservative default and with the accompanying new
> >>> allocation algorithm, I think it makes sense as a default.
> 
>  The difficulties have always been that virtual nodes have been
> >>> beneficial for operations but that 256 is too high for the purposes of
> >>> repair and as Joey and Josh cover, for availability.  Going lower with
> >> the
> >>> original allocation algorithm has produced skew in allocation in its
> >> naive
> >>> distribution.  Enter CASSANDRA-7032 <
> >>> https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new
> token
> >>> allocation algorithm.  CASSANDRA-15260 <
> >>> https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> >>> algorithm operationally simpler.
> 
>  One other item of note - since Joey and Josh's analysis, there have
> >> been
> >>> improvements in streaming and other considerations that can reduce the
> >>> probability of more than one node representing some token range being
> >>> unavailable, but it would still be good to be conservative.
> 
>  Please chime in with any concerns with having num_tokens=4 and
> >>> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-30 Thread onmstester onmstester
Shouldn't we consider the cluster size to configure num_tokens? 

For example, is it OK to use num_tokens=4 for a cluster of more than 100 
nodes?



Another question that is not so relevant to this:

When we use the token assignment algorithm (the new/non-random one) for a 
specific keyspace, why should we set an initial token on all the seeds? Isn't 
one seed enough, with just the keyspace set for all the other nodes?



Also, I do not understand why we should consider rack topology and the number 
of racks when configuring num_tokens.



Sent using https://www.zoho.com/mail/




 On Thu, 30 Jan 2020 04:33:57 +0330 Jeremy Hanna 
 wrote 


The new default wouldn't be retroactively set for 3.x, but the same principles 
apply.  The new algorithm is in 3.x as well as the simplification of the 
configuration.  So no reason not to use the same configuration on 3.x. 
 
> On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek 
>  wrote: 
> 
> Does the same guidance apply to 3.x clusters? I read through the JIRA ticket 
> linked below, along with tickets that it links to, but it's not clear that 
> the new allocation algorithm is available in 3.x or if there are other 
> reasons that this would be problematic. 
> 
> Thanks, 
> 
> Derek 
> 
> On 1/29/20, 9:54 AM, "Jon Haddad"  wrote: 
> 
>Ive put a lot of my previous clients on 4 tokens, all of which have 
>resulted in a major improvement. 
> 
>I wouldn't use any more than 4 except under some pretty unusual 
>circumstances. 
> 
>Jon 
> 
>On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  
> wrote: 
> 
>> +1 to reducing the number of tokens as low as possible for availability 
>> issues. 4 lgtm 
>> 
>> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  
>> wrote: 
>> 
>>> Thanks for restarting this discussion Jeremy. I personally think 4 is a 
>>> good number as a default. I think whatever we pick, we should have enough 
>>> documentation for operators to make sense of the new defaults in 4.0. 
>>> 
>>> Dinesh 
>>> 
 On Jan 28, 2020, at 9:25 PM, Jeremy Hanna 
  
>>> wrote: 
 
 I wanted to start a discussion about the default for num_tokens that 
>>> we'd like for people starting in Cassandra 4.0.  This is for ticket 
>>> CASSANDRA-13701  
>>> (which has been duplicated a number of times, most recently by me). 
 
 TLDR, based on availability concerns, skew concerns, operational 
>>> concerns, and based on the fact that the new allocation algorithm can be 
>>> configured fairly simply now, this is a proposal to go with 4 as the new 
>>> default and the allocate_tokens_for_local_replication_factor set to 3. 
>>> That gives a good experience out of the box for people and is the most 
>>> conservative.  It does assume that racks and DCs have been configured 
>>> correctly.  We would, of course, go into some detail in the NEWS.txt. 
 
 Joey Lynch and Josh Snyder did an extensive analysis of availability 
>>> concerns with high num_tokens/virtual nodes in their paper < 
>>> 
>> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E
>>  
>>> . 
>>> This worsens as clusters grow larger.  I won't quote the paper here but 
>> in 
>>> order to have a conservative default and with the accompanying new 
>>> allocation algorithm, I think it makes sense as a default. 
 
 The difficulties have always been that virtual nodes have been 
>>> beneficial for operations but that 256 is too high for the purposes of 
>>> repair and as Joey and Josh cover, for availability.  Going lower with 
>> the 
>>> original allocation algorithm has produced skew in allocation in its 
>> naive 
>>> distribution.  Enter CASSANDRA-7032 < 
>>> https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token 
>>> allocation algorithm.  CASSANDRA-15260 < 
>>> https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new 
>>> algorithm operationally simpler. 
 
 One other item of note - since Joey and Josh's analysis, there have 
>> been 
>>> improvements in streaming and other considerations that can reduce the 
>>> probability of more than one node representing some token range being 
>>> unavailable, but it would still be good to be conservative. 
 
 Please chime in with any concerns with having num_tokens=4 and 
>>> allocate_tokens_for_local_replication_factor=3 and the accompanying 
>>> rationale so we can improve the experience for all users. 
 
 Other resources: 
 
>>> 
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>  
 
>>> 
>> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
>>  
 
>>> 
>> 

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Jeremy Hanna
The new default wouldn't be retroactively set for 3.x, but the same principles 
apply.  The new algorithm is in 3.x, as is the simplification of the 
configuration, so there is no reason not to use the same configuration on 3.x.

> On Jan 30, 2020, at 4:34 AM, Chen-Becker, Derek  
> wrote:
> 
> Does the same guidance apply to 3.x clusters? I read through the JIRA ticket 
> linked below, along with tickets that it links to, but it's not clear that 
> the new allocation algorithm is available in 3.x or if there are other 
> reasons that this would be problematic.
> 
> Thanks,
> 
> Derek
> 
> On 1/29/20, 9:54 AM, "Jon Haddad"  wrote:
> 
>Ive put a lot of my previous clients on 4 tokens, all of which have
>resulted in a major improvement.
> 
>I wouldn't use any more than 4 except under some pretty unusual
>circumstances.
> 
>Jon
> 
>On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  wrote:
> 
>> +1 to reducing the number of tokens as low as possible for availability
>> issues. 4 lgtm
>> 
>> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:
>> 
>>> Thanks for restarting this discussion Jeremy. I personally think 4 is a
>>> good number as a default. I think whatever we pick, we should have enough
>>> documentation for operators to make sense of the new defaults in 4.0.
>>> 
>>> Dinesh
>>> 
 On Jan 28, 2020, at 9:25 PM, Jeremy Hanna 
>>> wrote:
 
 I wanted to start a discussion about the default for num_tokens that
>>> we'd like for people starting in Cassandra 4.0.  This is for ticket
>>> CASSANDRA-13701 
>>> (which has been duplicated a number of times, most recently by me).
 
 TLDR, based on availability concerns, skew concerns, operational
>>> concerns, and based on the fact that the new allocation algorithm can be
>>> configured fairly simply now, this is a proposal to go with 4 as the new
>>> default and the allocate_tokens_for_local_replication_factor set to 3.
>>> That gives a good experience out of the box for people and is the most
>>> conservative.  It does assume that racks and DCs have been configured
>>> correctly.  We would, of course, go into some detail in the NEWS.txt.
 
 Joey Lynch and Josh Snyder did an extensive analysis of availability
>>> concerns with high num_tokens/virtual nodes in their paper <
>>> 
>> http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E
>>> .
>>> This worsens as clusters grow larger.  I won't quote the paper here but
>> in
>>> order to have a conservative default and with the accompanying new
>>> allocation algorithm, I think it makes sense as a default.
 
 The difficulties have always been that virtual nodes have been
>>> beneficial for operations but that 256 is too high for the purposes of
>>> repair and as Joey and Josh cover, for availability.  Going lower with
>> the
>>> original allocation algorithm has produced skew in allocation in its
>> naive
>>> distribution.  Enter CASSANDRA-7032 <
>>> https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
>>> allocation algorithm.  CASSANDRA-15260 <
>>> https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
>>> algorithm operationally simpler.
 
 One other item of note - since Joey and Josh's analysis, there have
>> been
>>> improvements in streaming and other considerations that can reduce the
>>> probability of more than one node representing some token range being
>>> unavailable, but it would still be good to be conservative.
 
 Please chime in with any concerns with having num_tokens=4 and
>>> allocate_tokens_for_local_replication_factor=3 and the accompanying
>>> rationale so we can improve the experience for all users.
 
 Other resources:
 
>>> 
>> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
 
>>> 
>> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
 
>>> 
>> https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
 
>>> 
>>> 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>>> 
>> 
>> --
>> 
>> Ben Bromhead
>> 
>> Instaclustr | www.instaclustr.com | @instaclustr
>>  | (650) 284 9692
>> 
> 
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Chen-Becker, Derek
Does the same guidance apply to 3.x clusters? I read through the JIRA ticket 
linked below, along with tickets that it links to, but it's not clear whether 
the new allocation algorithm is available in 3.x or whether there are other 
reasons that this would be problematic.

Thanks,

Derek

On 1/29/20, 9:54 AM, "Jon Haddad"  wrote:

Ive put a lot of my previous clients on 4 tokens, all of which have
resulted in a major improvement.

I wouldn't use any more than 4 except under some pretty unusual
circumstances.

Jon

On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  wrote:

> +1 to reducing the number of tokens as low as possible for availability
> issues. 4 lgtm
>
> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:
>
> > Thanks for restarting this discussion Jeremy. I personally think 4 is a
> > good number as a default. I think whatever we pick, we should have enough
> > documentation for operators to make sense of the new defaults in 4.0.
> >
> > Dinesh
> >
> > > On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  wrote:
> > >
> > > I wanted to start a discussion about the default for num_tokens that
> > > we'd like for people starting in Cassandra 4.0.  This is for ticket
> > > CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701>
> > > (which has been duplicated a number of times, most recently by me).
> > >
> > > TLDR, based on availability concerns, skew concerns, operational
> > > concerns, and based on the fact that the new allocation algorithm can be
> > > configured fairly simply now, this is a proposal to go with 4 as the new
> > > default and the allocate_tokens_for_local_replication_factor set to 3.
> > > That gives a good experience out of the box for people and is the most
> > > conservative.  It does assume that racks and DCs have been configured
> > > correctly.  We would, of course, go into some detail in the NEWS.txt.
> > >
> > > Joey Lynch and Josh Snyder did an extensive analysis of availability
> > > concerns with high num_tokens/virtual nodes in their paper
> > > <http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
> > > This worsens as clusters grow larger.  I won't quote the paper here but in
> > > order to have a conservative default and with the accompanying new
> > > allocation algorithm, I think it makes sense as a default.
> > >
> > > The difficulties have always been that virtual nodes have been
> > > beneficial for operations but that 256 is too high for the purposes of
> > > repair and as Joey and Josh cover, for availability.  Going lower with the
> > > original allocation algorithm has produced skew in allocation in its naive
> > > distribution.  Enter CASSANDRA-7032
> > > <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
> > > allocation algorithm.  CASSANDRA-15260
> > > <https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> > > algorithm operationally simpler.
> > >
> > > One other item of note - since Joey and Josh's analysis, there have been
> > > improvements in streaming and other considerations that can reduce the
> > > probability of more than one node representing some token range being
> > > unavailable, but it would still be good to be conservative.
> > >
> > > Please chime in with any concerns with having num_tokens=4 and
> > > allocate_tokens_for_local_replication_factor=3 and the accompanying
> > > rationale so we can improve the experience for all users.
> > >
> > > Other resources:
> > >
> > > https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> > > https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> > > https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
>  | (650) 284 9692
>



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Jon Haddad
I've put a lot of my previous clients on 4 tokens, all of which have
resulted in a major improvement.

I wouldn't use any more than 4 except under some pretty unusual
circumstances.

Jon

On Wed, Jan 29, 2020, 11:18 AM Ben Bromhead  wrote:

> +1 to reducing the number of tokens as low as possible for availability
> issues. 4 lgtm
>
> On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:
>
> > Thanks for restarting this discussion Jeremy. I personally think 4 is a
> > good number as a default. I think whatever we pick, we should have enough
> > documentation for operators to make sense of the new defaults in 4.0.
> >
> > Dinesh
> >
> > > On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  wrote:
> > >
> > > I wanted to start a discussion about the default for num_tokens that
> > > we'd like for people starting in Cassandra 4.0.  This is for ticket
> > > CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701>
> > > (which has been duplicated a number of times, most recently by me).
> > >
> > > TLDR, based on availability concerns, skew concerns, operational
> > > concerns, and based on the fact that the new allocation algorithm can be
> > > configured fairly simply now, this is a proposal to go with 4 as the new
> > > default and the allocate_tokens_for_local_replication_factor set to 3.
> > > That gives a good experience out of the box for people and is the most
> > > conservative.  It does assume that racks and DCs have been configured
> > > correctly.  We would, of course, go into some detail in the NEWS.txt.
> > >
> > > Joey Lynch and Josh Snyder did an extensive analysis of availability
> > > concerns with high num_tokens/virtual nodes in their paper
> > > <http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
> > > This worsens as clusters grow larger.  I won't quote the paper here but in
> > > order to have a conservative default and with the accompanying new
> > > allocation algorithm, I think it makes sense as a default.
> > >
> > > The difficulties have always been that virtual nodes have been
> > > beneficial for operations but that 256 is too high for the purposes of
> > > repair and as Joey and Josh cover, for availability.  Going lower with the
> > > original allocation algorithm has produced skew in allocation in its naive
> > > distribution.  Enter CASSANDRA-7032
> > > <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
> > > allocation algorithm.  CASSANDRA-15260
> > > <https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> > > algorithm operationally simpler.
> > >
> > > One other item of note - since Joey and Josh's analysis, there have been
> > > improvements in streaming and other considerations that can reduce the
> > > probability of more than one node representing some token range being
> > > unavailable, but it would still be good to be conservative.
> > >
> > > Please chime in with any concerns with having num_tokens=4 and
> > > allocate_tokens_for_local_replication_factor=3 and the accompanying
> > > rationale so we can improve the experience for all users.
> > >
> > > Other resources:
> > >
> > > https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> > > https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> > > https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>
> --
>
> Ben Bromhead
>
> Instaclustr | www.instaclustr.com | @instaclustr
>  | (650) 284 9692
>


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-29 Thread Ben Bromhead
+1 to reducing the number of tokens as low as possible for availability
issues. 4 lgtm

On Wed, Jan 29, 2020 at 1:14 AM Dinesh Joshi  wrote:

> Thanks for restarting this discussion Jeremy. I personally think 4 is a
> good number as a default. I think whatever we pick, we should have enough
> documentation for operators to make sense of the new defaults in 4.0.
>
> Dinesh
>
> > On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  wrote:
> >
> > I wanted to start a discussion about the default for num_tokens that
> > we'd like for people starting in Cassandra 4.0.  This is for ticket
> > CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701>
> > (which has been duplicated a number of times, most recently by me).
> >
> > TLDR, based on availability concerns, skew concerns, operational
> > concerns, and based on the fact that the new allocation algorithm can be
> > configured fairly simply now, this is a proposal to go with 4 as the new
> > default and the allocate_tokens_for_local_replication_factor set to 3.
> > That gives a good experience out of the box for people and is the most
> > conservative.  It does assume that racks and DCs have been configured
> > correctly.  We would, of course, go into some detail in the NEWS.txt.
> >
> > Joey Lynch and Josh Snyder did an extensive analysis of availability
> > concerns with high num_tokens/virtual nodes in their paper
> > <http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
> > This worsens as clusters grow larger.  I won't quote the paper here but in
> > order to have a conservative default and with the accompanying new
> > allocation algorithm, I think it makes sense as a default.
> >
> > The difficulties have always been that virtual nodes have been
> > beneficial for operations but that 256 is too high for the purposes of
> > repair and as Joey and Josh cover, for availability.  Going lower with the
> > original allocation algorithm has produced skew in allocation in its naive
> > distribution.  Enter CASSANDRA-7032
> > <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the new token
> > allocation algorithm.  CASSANDRA-15260
> > <https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new
> > algorithm operationally simpler.
> >
> > One other item of note - since Joey and Josh's analysis, there have been
> > improvements in streaming and other considerations that can reduce the
> > probability of more than one node representing some token range being
> > unavailable, but it would still be good to be conservative.
> >
> > Please chime in with any concerns with having num_tokens=4 and
> > allocate_tokens_for_local_replication_factor=3 and the accompanying
> > rationale so we can improve the experience for all users.
> >
> > Other resources:
> >
> > https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> > https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> > https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
 | (650) 284 9692


Re: [Discuss] num_tokens default in Cassandra 4.0

2020-01-28 Thread Dinesh Joshi
Thanks for restarting this discussion Jeremy. I personally think 4 is a good 
number as a default. I think whatever we pick, we should have enough 
documentation for operators to make sense of the new defaults in 4.0. 

Dinesh

> On Jan 28, 2020, at 9:25 PM, Jeremy Hanna  wrote:
> 
> I wanted to start a discussion about the default for num_tokens that we'd 
> like for people starting in Cassandra 4.0.  This is for ticket 
> CASSANDRA-13701 <https://issues.apache.org/jira/browse/CASSANDRA-13701> 
> (which has been duplicated a number of times, most recently by me).
> 
> TLDR, based on availability concerns, skew concerns, operational concerns, 
> and based on the fact that the new allocation algorithm can be configured 
> fairly simply now, this is a proposal to go with 4 as the new default and the 
> allocate_tokens_for_local_replication_factor set to 3.  That gives a good 
> experience out of the box for people and is the most conservative.  It does 
> assume that racks and DCs have been configured correctly.  We would, of 
> course, go into some detail in the NEWS.txt.
> 
> Joey Lynch and Josh Snyder did an extensive analysis of availability concerns 
> with high num_tokens/virtual nodes in their paper 
> <http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
>   This worsens as clusters grow larger.  I won't quote the paper here but in 
> order to have a conservative default and with the accompanying new allocation 
> algorithm, I think it makes sense as a default.
> 
> The difficulties have always been that virtual nodes have been beneficial for 
> operations but that 256 is too high for the purposes of repair and as Joey 
> and Josh cover, for availability.  Going lower with the original allocation 
> algorithm has produced skew in allocation in its naive distribution.  Enter 
> CASSANDRA-7032 <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the 
> new token allocation algorithm.  CASSANDRA-15260 
> <https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new 
> algorithm operationally simpler.
> 
> One other item of note - since Joey and Josh's analysis, there have been 
> improvements in streaming and other considerations that can reduce the 
> probability of more than one node representing some token range being 
> unavailable, but it would still be good to be conservative.
> 
> Please chime in with any concerns with having num_tokens=4 and 
> allocate_tokens_for_local_replication_factor=3 and the accompanying rationale 
> so we can improve the experience for all users.
> 
> Other resources:
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
> https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
> https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



[Discuss] num_tokens default in Cassandra 4.0

2020-01-28 Thread Jeremy Hanna
I wanted to start a discussion about the default for num_tokens that we'd like 
for people starting in Cassandra 4.0.  This is for ticket CASSANDRA-13701 
<https://issues.apache.org/jira/browse/CASSANDRA-13701> (which has been 
duplicated a number of times, most recently by me).

TLDR, based on availability concerns, skew concerns, operational concerns, and 
based on the fact that the new allocation algorithm can be configured fairly 
simply now, this is a proposal to go with 4 as the new default and the 
allocate_tokens_for_local_replication_factor set to 3.  That gives a good 
experience out of the box for people and is the most conservative.  It does 
assume that racks and DCs have been configured correctly.  We would, of course, 
go into some detail in the NEWS.txt.
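
To make this concrete, the proposal would amount to something like the 
following in cassandra.yaml (a sketch of the proposed defaults, not the 
final shipped file; the value of 3 assumes the keyspaces in the local DC 
actually use replication factor 3):

    # Sketch of the proposed 4.0 defaults: a few well-placed tokens
    # instead of 256 randomly chosen ones.
    num_tokens: 4
    # Drives the new token allocation algorithm (CASSANDRA-15260); set it
    # to the replication factor your keyspaces use in this datacenter.
    allocate_tokens_for_local_replication_factor: 3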

Joey Lynch and Josh Snyder did an extensive analysis of availability concerns 
with high num_tokens/virtual nodes in their paper 
<http://mail-archives.apache.org/mod_mbox/cassandra-dev/201804.mbox/%3CCALShVHcz5PixXFO_4bZZZNnKcrpph-=5QmCyb0M=w-mhdyl...@mail.gmail.com%3E>.
  This worsens as clusters grow larger.  I won't quote the paper here but in 
order to have a conservative default and with the accompanying new allocation 
algorithm, I think it makes sense as a default.
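
As a rough back-of-envelope of why this matters (my sketch, not a formula 
from the paper): with n nodes, num_tokens v, and replication factor RF, each 
node shares at least one replica set with roughly min(2v(RF-1), n-1) other 
nodes.  So if one node is already down, the chance that a second independent 
failure makes some range lose quorum is about min(2v(RF-1), n-1) / (n-1).  
In a 100-node RF=3 cluster that is effectively 1 at v=256 (any two failures 
overlap somewhere), but only about 16/99 at v=4.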

The difficulties have always been that virtual nodes have been beneficial for 
operations but that 256 is too high for the purposes of repair and as Joey and 
Josh cover, for availability.  Going lower with the original allocation 
algorithm has produced skew in allocation in its naive distribution.  Enter 
CASSANDRA-7032 <https://issues.apache.org/jira/browse/CASSANDRA-7032> and the 
new token allocation algorithm.  CASSANDRA-15260 
<https://issues.apache.org/jira/browse/CASSANDRA-15260> makes the new algorithm 
operationally simpler.
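
For anyone wondering how those two tickets differ in practice (my summary, 
worth double-checking against the tickets): CASSANDRA-7032 shipped the 
algorithm in the 3.x line behind allocate_tokens_for_keyspace, which has a 
chicken-and-egg problem - the keyspace has to exist with the right 
replication settings before new nodes bootstrap.  CASSANDRA-15260 lets you 
state the intended replication factor directly.  Roughly:

    # 3.x-era knob (CASSANDRA-7032): allocation follows an existing
    # keyspace's replication; "my_keyspace" is a placeholder and must be
    # created with the target RF before nodes join.
    # allocate_tokens_for_keyspace: my_keyspace

    # 4.0 knob (CASSANDRA-15260): no pre-existing keyspace needed.
    allocate_tokens_for_local_replication_factor: 3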

One other item of note - since Joey and Josh's analysis, there have been 
improvements in streaming and other considerations that can reduce the 
probability of more than one node representing some token range being 
unavailable, but it would still be good to be conservative.

Please chime in with any concerns with having num_tokens=4 and 
allocate_tokens_for_local_replication_factor=3 and the accompanying rationale 
so we can improve the experience for all users.

Other resources:
https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/config/configVnodes.html
https://www.datastax.com/blog/2016/01/new-token-allocation-algorithm-cassandra-30