Re: [DISCUSS] changing default token behavior for 4.0

2018-09-25 Thread kurt greaves
This was exactly the kind of problem I was foreseeing. I don't see any
simple way of fixing it without introducing some shuffle-like nightmare
that does a whole bunch of token movements though. On the other hand we
could just document best practice, and also make it so that by default you
have to choose between random and the algorithm for allocation initially,
essentially the way I mentioned before. At least this way people are aware
of the advantages and disadvantages from the start, rather than everyone
just ending up with random allocation because that's what they were given.
Anyway, this isn't a simple problem so we could probably come up with
something better than that with a bit more thought.

On Tue, 25 Sep 2018 at 05:43, Benedict Elliott Smith 
wrote:

> This sounds worthy of a bug report!  We should at least document any such
> inadequacy, and come up with a plan to fix it.  It would be great if you
> could file a ticket with a detailed example of the problem.
>
> > On 24 Sep 2018, at 14:57, Tom van der Woerdt <
> tom.vanderwoe...@booking.com> wrote:
> >
> > Late comment, but I'll write it anyway.
> >
> > The main advantage of random allocation over the new allocation strategy
> is
> > that it seems to be significantly better when dealing with node
> *removals*,
> > when the order of removal is not the inverse of the order of addition.
> This
> > can lead to severely unbalanced clusters when the new strategy is
> enabled.
> >
> > I tend to go with the random allocation for this reason: you can freely
> > add/remove nodes when needed, and the data distribution will remain "good
> > enough". It's only when the data density becomes high enough that the new
> > token allocation strategy really matters, imho.
> >
> > Hope that helps!
> >
> > Tom van der Woerdt
> > Site Reliability Engineer
> >
> > Booking.com B.V.
> > Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
> > The world's #1 accommodation site
> > 43 languages, 198+ offices worldwide, 120,000+ global destinations,
> > 1,550,000+ room nights booked every day
> > No booking fees, best price always guaranteed
> > Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
> >
> >
> > On Sat, Sep 22, 2018 at 8:12 PM Jonathan Haddad 
> wrote:
> >
> >> Is there a use case for random allocation? How does it help with
> testing? I
> >> can’t see a reason to keep it around.
> >>
> >> On Sat, Sep 22, 2018 at 3:06 AM kurt greaves 
> wrote:
> >>
> >>> +1. I've been making a case for this for some time now, and was
> actually
> >> a
> >>> focus of my talk last week. I'd be very happy to get this into 4.0.
> >>>
> >>> We've tested various num_tokens with the algorithm on various sized
> >>> clusters and we've found that typically 16 works best. With lower
> numbers
> >>> we found that balance is good initially but as a cluster gets larger
> you
> >>> have some problems. E.g We saw that on a 60 node cluster with 8 tokens
> >> per
> >>> node we were seeing a difference of 22% in token ownership, but on a
> <=12
> >>> node cluster a difference of only 12%. 16 tokens on the other hand
> wasn't
> >>> perfect but generally gave a better balance regardless of cluster size
> at
> >>> least up to 100 nodes. TBH we should probably do some proper testing
> and
> >>> record all the results for this before we pick a default (I'm happy to
> do
> >>> this - think we can use the original testing script for this).
> >>>
> >>> But anyway, I'd say Jon is on the right track. Personally how I'd like
> to
> >>> see it is that we:
> >>>
> >>>   1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in
> >> the
> >>>   same way that DSE does it. Allowing a user to specify a RF to
> allocate
> >>>   from, and allowing multiple DC's.
> >>>   2. Add a new boolean property random_token_allocation, defaults to
> >>> false.
> >>>   3. Make allocate_tokens_for_rf default to *unset**.
> >>>   4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
> >>>   random_token_allocation != true.
> >>>   5. Default num_tokens to 16 (or whatever we find appropriate)
> >>>
> >>> * I think setting a default is asking for trouble. When people are
> going
> >> to
> >>> add new DC's/nodes we don't want to risk them adding a node with the
> >> wrong
> >>> RF. I think it's safe to say that a user should have to think about
> this
> >>> before they spin up their cluster.
> >>> ** Following above, it should be required to be set so that we don't
> have
> >>> people accidentally using random allocation. I think we should really
> be
> >>> aiming to get rid of random allocation completely, but provide a new
> >>> property to enable it for backwards compatibility (also for testing).
> >>>
> >>> It's worth noting that a smaller number of tokens *theoretically*
> >> decreases
> >>> the time for replacement/rebuild, so if we're considering QUORUM
> >>> availability with vnodes there's an argument against having a very low
> >>> num_tokens. I 

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-25 Thread kurt greaves
Sounds good to me. I'm going to play around with the algorithm and actually
record some numbers/evidence over the next week to help us decide.

On Tue, 25 Sep 2018 at 05:38, Joseph Lynch  wrote:

> I am a big fan of lowering the default number of tokens for many
> reasons (availability, repair, etc...). I also agree there are some
> usability blockers to "just lowering the number today", but I very
> much agree that the current default of 256 random tokens is a huge bug
> I hope we fix by 4.0 release.
>
> It sounds like Kurt and Jon have done a lot of work already on this
> problem, and internally I've worked on this as well (Netflix's
> internal token allocation as well as evaluating vnodes that resulted
> in the paper I sent out) so I would be excited to help fix this for
> 4.0. Maybe the three of us (plus any others that are interested) can
> put together a short proposal over the next few days including the
> following:
>
> 1. What precisely should we change the defaults to
> 2. Given the new defaults how would a user bootstrap a new cluster
> 3. Given the new defaults how would a user add capacity to an existing
> cluster
> 4. Concrete jiras that would implement #1 with minimal possible scope
>
> Then we could send the proposal to the dev list for feedback and if
> there is consensus that the scope is not too large/dangerous and a
> committer (Jon perhaps) can commit to reviewing/merging, we can work
> on them and be accountable to merge them before the 4.0 release?
>
> -Joey
> On Sun, Sep 23, 2018 at 1:42 PM Nate McCall  wrote:
> >
> > Let's pick a default setup that works for most people (IME clusters <
> > 30 nodes, but TLP and Instaclustr peeps probably have the most insight
> > here).
> >
> > Then we just explain the heck out of it in the comments. I would also
> > like to see this include some details add/remove a DC to change the
> > values (perhaps we sub-task a doc creation for that?).
> >
> > Good discussion though - thanks folks.
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>
>


Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Benedict Elliott Smith
This sounds worthy of a bug report!  We should at least document any such 
inadequacy, and come up with a plan to fix it.  It would be great if you could 
file a ticket with a detailed example of the problem.

> On 24 Sep 2018, at 14:57, Tom van der Woerdt  
> wrote:
> 
> Late comment, but I'll write it anyway.
> 
> The main advantage of random allocation over the new allocation strategy is
> that it seems to be significantly better when dealing with node *removals*,
> when the order of removal is not the inverse of the order of addition. This
> can lead to severely unbalanced clusters when the new strategy is enabled.
> 
> I tend to go with the random allocation for this reason: you can freely
> add/remove nodes when needed, and the data distribution will remain "good
> enough". It's only when the data density becomes high enough that the new
> token allocation strategy really matters, imho.
> 
> Hope that helps!
> 
> Tom van der Woerdt
> Site Reliability Engineer
> 
> Booking.com B.V.
> Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
> The world's #1 accommodation site
> 43 languages, 198+ offices worldwide, 120,000+ global destinations,
> 1,550,000+ room nights booked every day
> No booking fees, best price always guaranteed
> Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)
> 
> 
> On Sat, Sep 22, 2018 at 8:12 PM Jonathan Haddad  wrote:
> 
>> Is there a use case for random allocation? How does it help with testing? I
>> can’t see a reason to keep it around.
>> 
>> On Sat, Sep 22, 2018 at 3:06 AM kurt greaves  wrote:
>> 
>>> +1. I've been making a case for this for some time now, and was actually
>> a
>>> focus of my talk last week. I'd be very happy to get this into 4.0.
>>> 
>>> We've tested various num_tokens with the algorithm on various sized
>>> clusters and we've found that typically 16 works best. With lower numbers
>>> we found that balance is good initially but as a cluster gets larger you
>>> have some problems. E.g We saw that on a 60 node cluster with 8 tokens
>> per
>>> node we were seeing a difference of 22% in token ownership, but on a <=12
>>> node cluster a difference of only 12%. 16 tokens on the other hand wasn't
>>> perfect but generally gave a better balance regardless of cluster size at
>>> least up to 100 nodes. TBH we should probably do some proper testing and
>>> record all the results for this before we pick a default (I'm happy to do
>>> this - think we can use the original testing script for this).
>>> 
>>> But anyway, I'd say Jon is on the right track. Personally how I'd like to
>>> see it is that we:
>>> 
>>>   1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in
>> the
>>>   same way that DSE does it. Allowing a user to specify a RF to allocate
>>>   from, and allowing multiple DC's.
>>>   2. Add a new boolean property random_token_allocation, defaults to
>>> false.
>>>   3. Make allocate_tokens_for_rf default to *unset**.
>>>   4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
>>>   random_token_allocation != true.
>>>   5. Default num_tokens to 16 (or whatever we find appropriate)
>>> 
>>> * I think setting a default is asking for trouble. When people are going
>> to
>>> add new DC's/nodes we don't want to risk them adding a node with the
>> wrong
>>> RF. I think it's safe to say that a user should have to think about this
>>> before they spin up their cluster.
>>> ** Following above, it should be required to be set so that we don't have
>>> people accidentally using random allocation. I think we should really be
>>> aiming to get rid of random allocation completely, but provide a new
>>> property to enable it for backwards compatibility (also for testing).
>>> 
>>> It's worth noting that a smaller number of tokens *theoretically*
>> decreases
>>> the time for replacement/rebuild, so if we're considering QUORUM
>>> availability with vnodes there's an argument against having a very low
>>> num_tokens. I think it's better to utilise NTS and racks to reduce the
>>> chance of a QUORUM outage over banking on having a lower number of
>> tokens,
>>> as with just a low number of tokens unless you go all the way to 1 you
>> are
>>> just relying on luck that 2 nodes don't overlap. Guess what I'm saying is
>>> that I think we should be choosing a num_tokens that gives the best
>>> distribution for most cluster sizes rather than choosing one that
>>> "decreases" the probability of an outage.
>>> 
>>> Also I think we should continue using CASSANDRA-13701 to track this. TBH
>> I
>>> think in general we should be a bit better at searching for and using
>>> existing tickets...
>>> 
>>> On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski 
>> wrote:
>>> 
>>>> There already have been some discussions on this here:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-13701
>>>>
>>>> The mentioned blocker there on the token allocation shouldn't exist
>>>> anymore. Although it would be good to get more feedback 

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Joseph Lynch
I am a big fan of lowering the default number of tokens for many
reasons (availability, repair, etc.). I also agree there are some
usability blockers to "just lowering the number today", but I very
much agree that the current default of 256 random tokens is a huge bug
that I hope we fix by the 4.0 release.

It sounds like Kurt and Jon have done a lot of work already on this
problem, and internally I've worked on this as well (Netflix's
internal token allocation as well as evaluating vnodes that resulted
in the paper I sent out) so I would be excited to help fix this for
4.0. Maybe the three of us (plus any others that are interested) can
put together a short proposal over the next few days including the
following:

1. What precisely should we change the defaults to
2. Given the new defaults how would a user bootstrap a new cluster
3. Given the new defaults how would a user add capacity to an existing cluster
4. Concrete jiras that would implement #1 with minimal possible scope

Then we could send the proposal to the dev list for feedback and if
there is consensus that the scope is not too large/dangerous and a
committer (Jon perhaps) can commit to reviewing/merging, we can work
on them and be accountable to merge them before the 4.0 release?

-Joey
On Sun, Sep 23, 2018 at 1:42 PM Nate McCall  wrote:
>
> Let's pick a default setup that works for most people (IME clusters <
> 30 nodes, but TLP and Instaclustr peeps probably have the most insight
> here).
>
> Then we just explain the heck out of it in the comments. I would also
> like to see this include some details add/remove a DC to change the
> values (perhaps we sub-task a doc creation for that?).
>
> Good discussion though - thanks folks.
>
>




Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Tom van der Woerdt
Late comment, but I'll write it anyway.

The main advantage of random allocation over the new allocation strategy is
that it seems to be significantly better when dealing with node *removals*,
when the order of removal is not the inverse of the order of addition. This
can lead to severely unbalanced clusters when the new strategy is enabled.

I tend to go with the random allocation for this reason: you can freely
add/remove nodes when needed, and the data distribution will remain "good
enough". It's only when the data density becomes high enough that the new
token allocation strategy really matters, imho.
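
A toy illustration of the removal-order effect described above. This is a
sketch under loud assumptions: it uses one token per node on a unit ring and a
hypothetical "bisect the largest gap" allocator (next_token), which is NOT
Cassandra's allocation algorithm. It only shows how tokens that were placed
relative to earlier nodes can leave an uneven ring when an arbitrary subset of
nodes is removed.

```python
def next_token(tokens):
    """Hypothetical allocator: midpoint of the largest gap on a unit ring."""
    ring = sorted(tokens)
    if not ring:
        return 0.0
    best_start, best_len = 0.0, -1.0
    for i, start in enumerate(ring):
        # The last gap wraps around to the first token.
        end = ring[i + 1] if i + 1 < len(ring) else ring[0] + 1.0
        if end - start > best_len:
            best_start, best_len = start, end - start
    return (best_start + best_len / 2) % 1.0


def build_cluster(n):
    """Add n nodes one at a time; returns {node_id: token}."""
    cluster = {}
    for node in range(n):
        cluster[node] = next_token(cluster.values())
    return cluster


def remove_nodes(cluster, removed):
    """Return a copy of the cluster without the given node ids."""
    return {n: t for n, t in cluster.items() if n not in removed}


def imbalance(cluster):
    """Ratio of the largest to the smallest owned range."""
    ring = sorted(cluster.values())
    prev = ring[-1] - 1.0  # wrap: the last token precedes the first
    sizes = []
    for token in ring:
        sizes.append(token - prev)
        prev = token
    return max(sizes) / min(sizes)


if __name__ == "__main__":
    cluster = build_cluster(8)
    print("8 nodes:", imbalance(cluster))
    print("remove newest four:", imbalance(remove_nodes(cluster, {7, 6, 5, 4})))
    print("remove a mixed set:", imbalance(remove_nodes(cluster, {1, 3, 6, 7})))
```

In this toy, removing nodes in the inverse order of addition keeps the ring
even, while removing the mixed set leaves one node owning five times as much
as the others. Whether the real algorithm behaves this badly would need the
detailed example requested elsewhere in the thread.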

Hope that helps!

Tom van der Woerdt
Site Reliability Engineer

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)


On Sat, Sep 22, 2018 at 8:12 PM Jonathan Haddad  wrote:

> Is there a use case for random allocation? How does it help with testing? I
> can’t see a reason to keep it around.
>
> On Sat, Sep 22, 2018 at 3:06 AM kurt greaves  wrote:
>
> > +1. I've been making a case for this for some time now, and was actually
> a
> > focus of my talk last week. I'd be very happy to get this into 4.0.
> >
> > We've tested various num_tokens with the algorithm on various sized
> > clusters and we've found that typically 16 works best. With lower numbers
> > we found that balance is good initially but as a cluster gets larger you
> > have some problems. E.g We saw that on a 60 node cluster with 8 tokens
> per
> > node we were seeing a difference of 22% in token ownership, but on a <=12
> > node cluster a difference of only 12%. 16 tokens on the other hand wasn't
> > perfect but generally gave a better balance regardless of cluster size at
> > least up to 100 nodes. TBH we should probably do some proper testing and
> > record all the results for this before we pick a default (I'm happy to do
> > this - think we can use the original testing script for this).
> >
> > But anyway, I'd say Jon is on the right track. Personally how I'd like to
> > see it is that we:
> >
> >1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in
> the
> >same way that DSE does it. Allowing a user to specify a RF to allocate
> >from, and allowing multiple DC's.
> >2. Add a new boolean property random_token_allocation, defaults to
> > false.
> >3. Make allocate_tokens_for_rf default to *unset**.
> >4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
> >random_token_allocation != true.
> >5. Default num_tokens to 16 (or whatever we find appropriate)
> >
> > * I think setting a default is asking for trouble. When people are going
> to
> > add new DC's/nodes we don't want to risk them adding a node with the
> wrong
> > RF. I think it's safe to say that a user should have to think about this
> > before they spin up their cluster.
> > ** Following above, it should be required to be set so that we don't have
> > people accidentally using random allocation. I think we should really be
> > aiming to get rid of random allocation completely, but provide a new
> > property to enable it for backwards compatibility (also for testing).
> >
> > It's worth noting that a smaller number of tokens *theoretically*
> decreases
> > the time for replacement/rebuild, so if we're considering QUORUM
> > availability with vnodes there's an argument against having a very low
> > num_tokens. I think it's better to utilise NTS and racks to reduce the
> > chance of a QUORUM outage over banking on having a lower number of
> tokens,
> > as with just a low number of tokens unless you go all the way to 1 you
> are
> > just relying on luck that 2 nodes don't overlap. Guess what I'm saying is
> > that I think we should be choosing a num_tokens that gives the best
> > distribution for most cluster sizes rather than choosing one that
> > "decreases" the probability of an outage.
> >
> > Also I think we should continue using CASSANDRA-13701 to track this. TBH
> I
> > think in general we should be a bit better at searching for and using
> > existing tickets...
> >
> > On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski 
> wrote:
> >
> > > There already have been some discussions on this here:
> > > https://issues.apache.org/jira/browse/CASSANDRA-13701
> > >
> > > The mentioned blocker there on the token allocation shouldn't exist
> > > anymore. Although it would be good to get more feedback on it, in case
> > > we want to enable it by default, along with new defaults for number of
> > > tokens.
> > >
> > >
> > > On 22.09.18 06:30, Dinesh Joshi wrote:
> > > > Jon, thanks for starting this thread!
> > > >
> > > > I have created CASSANDRA-14784 to track this.
> > > >
> > > > Dinesh
> > > >
> > > >> On Sep 21, 2018, at 9:18 PM, 

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-23 Thread Nate McCall
Let's pick a default setup that works for most people (IME clusters <
30 nodes, but TLP and Instaclustr peeps probably have the most insight
here).

Then we just explain the heck out of it in the comments. I would also
like to see this include some details on how to add/remove a DC to
change the values (perhaps we sub-task a doc creation for that?).

Good discussion though - thanks folks.




Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread kurt greaves
Only that it makes it easier to spin up a cluster.

I'm for removing it entirely as well, however I think we should keep it
around at least until the next major just as a safety precaution until the
algorithm is properly battle tested.

This is not a strongly held opinion though, I'm just foreseeing the "new
defaults don't work for my edge case" problem.

On Sun., 23 Sep. 2018, 04:12 Jonathan Haddad,  wrote:

> Is there a use case for random allocation? How does it help with testing? I
> can’t see a reason to keep it around.
>
> On Sat, Sep 22, 2018 at 3:06 AM kurt greaves  wrote:
>
> > +1. I've been making a case for this for some time now, and was actually
> a
> > focus of my talk last week. I'd be very happy to get this into 4.0.
> >
> > We've tested various num_tokens with the algorithm on various sized
> > clusters and we've found that typically 16 works best. With lower numbers
> > we found that balance is good initially but as a cluster gets larger you
> > have some problems. E.g We saw that on a 60 node cluster with 8 tokens
> per
> > node we were seeing a difference of 22% in token ownership, but on a <=12
> > node cluster a difference of only 12%. 16 tokens on the other hand wasn't
> > perfect but generally gave a better balance regardless of cluster size at
> > least up to 100 nodes. TBH we should probably do some proper testing and
> > record all the results for this before we pick a default (I'm happy to do
> > this - think we can use the original testing script for this).
> >
> > But anyway, I'd say Jon is on the right track. Personally how I'd like to
> > see it is that we:
> >
> >1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in
> the
> >same way that DSE does it. Allowing a user to specify a RF to allocate
> >from, and allowing multiple DC's.
> >2. Add a new boolean property random_token_allocation, defaults to
> > false.
> >3. Make allocate_tokens_for_rf default to *unset**.
> >4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
> >random_token_allocation != true.
> >5. Default num_tokens to 16 (or whatever we find appropriate)
> >
> > * I think setting a default is asking for trouble. When people are going
> to
> > add new DC's/nodes we don't want to risk them adding a node with the
> wrong
> > RF. I think it's safe to say that a user should have to think about this
> > before they spin up their cluster.
> > ** Following above, it should be required to be set so that we don't have
> > people accidentally using random allocation. I think we should really be
> > aiming to get rid of random allocation completely, but provide a new
> > property to enable it for backwards compatibility (also for testing).
> >
> > It's worth noting that a smaller number of tokens *theoretically*
> decreases
> > the time for replacement/rebuild, so if we're considering QUORUM
> > availability with vnodes there's an argument against having a very low
> > num_tokens. I think it's better to utilise NTS and racks to reduce the
> > chance of a QUORUM outage over banking on having a lower number of
> tokens,
> > as with just a low number of tokens unless you go all the way to 1 you
> are
> > just relying on luck that 2 nodes don't overlap. Guess what I'm saying is
> > that I think we should be choosing a num_tokens that gives the best
> > distribution for most cluster sizes rather than choosing one that
> > "decreases" the probability of an outage.
> >
> > Also I think we should continue using CASSANDRA-13701 to track this. TBH
> I
> > think in general we should be a bit better at searching for and using
> > existing tickets...
> >
> > On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski 
> wrote:
> >
> > > There already have been some discussions on this here:
> > > https://issues.apache.org/jira/browse/CASSANDRA-13701
> > >
> > > The mentioned blocker there on the token allocation shouldn't exist
> > > anymore. Although it would be good to get more feedback on it, in case
> > > we want to enable it by default, along with new defaults for number of
> > > tokens.
> > >
> > >
> > > On 22.09.18 06:30, Dinesh Joshi wrote:
> > > > Jon, thanks for starting this thread!
> > > >
> > > > I have created CASSANDRA-14784 to track this.
> > > >
> > > > Dinesh
> > > >
> > > >> On Sep 21, 2018, at 9:18 PM, Sankalp Kohli 
> > > wrote:
> > > >>
> > > >> Putting it on JIRA is to make sure someone is assigned to it and it
> is
> > > tracked. Changes should be discussed over ML like you are saying.
> > > >>
> > > >> On Sep 21, 2018, at 21:02, Jonathan Haddad 
> wrote:
> > > >>
> > > >>>> We should create a JIRA to find what other defaults we need
> revisit.
> > > >>> Changing a default is a pretty big deal, I think we should discuss
> > any
> > > >>> changes to defaults here on the ML before moving it into JIRA.
> It's
> > > nice
> > > >>> to get a bit more discussion around the change than what happens in
> > > JIRA.
> > > >>>
> > > >>> We (TLP) did some testing on 4 tokens 

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread Jonathan Haddad
Is there a use case for random allocation? How does it help with testing? I
can’t see a reason to keep it around.

On Sat, Sep 22, 2018 at 3:06 AM kurt greaves  wrote:

> +1. I've been making a case for this for some time now, and was actually a
> focus of my talk last week. I'd be very happy to get this into 4.0.
>
> We've tested various num_tokens with the algorithm on various sized
> clusters and we've found that typically 16 works best. With lower numbers
> we found that balance is good initially but as a cluster gets larger you
> have some problems. E.g We saw that on a 60 node cluster with 8 tokens per
> node we were seeing a difference of 22% in token ownership, but on a <=12
> node cluster a difference of only 12%. 16 tokens on the other hand wasn't
> perfect but generally gave a better balance regardless of cluster size at
> least up to 100 nodes. TBH we should probably do some proper testing and
> record all the results for this before we pick a default (I'm happy to do
> this - think we can use the original testing script for this).
>
> But anyway, I'd say Jon is on the right track. Personally how I'd like to
> see it is that we:
>
>1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in the
>same way that DSE does it. Allowing a user to specify a RF to allocate
>from, and allowing multiple DC's.
>2. Add a new boolean property random_token_allocation, defaults to
> false.
>3. Make allocate_tokens_for_rf default to *unset**.
>4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
>random_token_allocation != true.
>5. Default num_tokens to 16 (or whatever we find appropriate)
>
> * I think setting a default is asking for trouble. When people are going to
> add new DC's/nodes we don't want to risk them adding a node with the wrong
> RF. I think it's safe to say that a user should have to think about this
> before they spin up their cluster.
> ** Following above, it should be required to be set so that we don't have
> people accidentally using random allocation. I think we should really be
> aiming to get rid of random allocation completely, but provide a new
> property to enable it for backwards compatibility (also for testing).
>
> It's worth noting that a smaller number of tokens *theoretically* decreases
> the time for replacement/rebuild, so if we're considering QUORUM
> availability with vnodes there's an argument against having a very low
> num_tokens. I think it's better to utilise NTS and racks to reduce the
> chance of a QUORUM outage over banking on having a lower number of tokens,
> as with just a low number of tokens unless you go all the way to 1 you are
> just relying on luck that 2 nodes don't overlap. Guess what I'm saying is
> that I think we should be choosing a num_tokens that gives the best
> distribution for most cluster sizes rather than choosing one that
> "decreases" the probability of an outage.
>
> Also I think we should continue using CASSANDRA-13701 to track this. TBH I
> think in general we should be a bit better at searching for and using
> existing tickets...
>
> On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski  wrote:
>
> > There already have been some discussions on this here:
> > https://issues.apache.org/jira/browse/CASSANDRA-13701
> >
> > The mentioned blocker there on the token allocation shouldn't exist
> > anymore. Although it would be good to get more feedback on it, in case
> > we want to enable it by default, along with new defaults for number of
> > tokens.
> >
> >
> > On 22.09.18 06:30, Dinesh Joshi wrote:
> > > Jon, thanks for starting this thread!
> > >
> > > I have created CASSANDRA-14784 to track this.
> > >
> > > Dinesh
> > >
> > >> On Sep 21, 2018, at 9:18 PM, Sankalp Kohli 
> > wrote:
> > >>
> > >> Putting it on JIRA is to make sure someone is assigned to it and it is
> > tracked. Changes should be discussed over ML like you are saying.
> > >>
> > >> On Sep 21, 2018, at 21:02, Jonathan Haddad  wrote:
> > >>
> > >>>> We should create a JIRA to find what other defaults we need revisit.
> > >>> Changing a default is a pretty big deal, I think we should discuss
> any
> > >>> changes to defaults here on the ML before moving it into JIRA.  It's
> > nice
> > >>> to get a bit more discussion around the change than what happens in
> > JIRA.
> > >>>
> > >>> We (TLP) did some testing on 4 tokens and found it to work
> surprisingly
> > >>> well.   It wasn't particularly formal, but we verified the load stays
> > >>> pretty even with only 4 tokens as we added nodes to the cluster.
> > Higher
> > >>> token count hurts availability by increasing the number of nodes any
> > given
> > >>> node is a neighbor with, meaning any 2 nodes that fail have an
> > increased
> > >>> chance of downtime when using QUORUM.  In addition, with the recent
> > >>> streaming optimization it seems the token counts will give a greater
> > chance
> > >>> of a node streaming entire sstables (with LCS), meaning we'll do a
> > better

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread kurt greaves
+1. I've been making a case for this for some time now, and it was actually
a focus of my talk last week. I'd be very happy to get this into 4.0.

We've tested various num_tokens with the algorithm on various sized
clusters and we've found that typically 16 works best. With lower numbers
we found that balance is good initially but as a cluster gets larger you
have some problems. E.g. we saw that on a 60 node cluster with 8 tokens per
node we were seeing a difference of 22% in token ownership, but on a <=12
node cluster a difference of only 12%. 16 tokens on the other hand wasn't
perfect but generally gave a better balance regardless of cluster size at
least up to 100 nodes. TBH we should probably do some proper testing and
record all the results for this before we pick a default (I'm happy to do
this - think we can use the original testing script for this).
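
For anyone wanting a baseline to compare against, here is a minimal sketch of
measuring ownership spread for *random* allocation. Assumptions: a unit ring
instead of the real token space, no replication, and "spread" defined as
(max - min) ownership relative to the mean, which may not be the metric behind
the percentages above. It does not implement the allocation algorithm, and the
helper names are made up for illustration.

```python
import random


def ownership(tokens_by_node):
    """Fraction of a unit ring owned by each node.

    A token owns the range from the previous token (exclusive) up to
    itself (inclusive), wrapping around the ring.
    """
    ring = sorted((t, n) for n, toks in tokens_by_node.items() for t in toks)
    owned = {n: 0.0 for n in tokens_by_node}
    prev = ring[-1][0] - 1.0  # wrap: the last token precedes the first
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return owned


def random_allocation(nodes, num_tokens, rng):
    """Give each node num_tokens uniformly random tokens."""
    return {n: [rng.random() for _ in range(num_tokens)] for n in range(nodes)}


def spread(owned):
    """(max - min) ownership relative to the mean ownership."""
    mean = sum(owned.values()) / len(owned)
    return (max(owned.values()) - min(owned.values())) / mean


if __name__ == "__main__":
    rng = random.Random(42)
    for nodes in (12, 60, 100):
        for num_tokens in (4, 8, 16, 256):
            s = spread(ownership(random_allocation(nodes, num_tokens, rng)))
            print(f"nodes={nodes:3d} num_tokens={num_tokens:3d} spread={s:.0%}")
```

Averaging over many seeds would be needed before reading anything into the
numbers; a single run is noisy.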

But anyway, I'd say Jon is on the right track. Personally how I'd like to
see it is that we:

   1. Change allocate_tokens_for_keyspace to allocate_tokens_for_rf in the
   same way that DSE does it. This would allow a user to specify an RF to
   allocate from, and would allow multiple DCs.
   2. Add a new boolean property random_token_allocation, defaults to false.
   3. Make allocate_tokens_for_rf default to *unset**.
   4. Make allocate_tokens_for_rf *required*** if num_tokens > 1 and
   random_token_allocation != true.
   5. Default num_tokens to 16 (or whatever we find appropriate)

* I think setting a default is asking for trouble. When people are going to
add new DC's/nodes we don't want to risk them adding a node with the wrong
RF. I think it's safe to say that a user should have to think about this
before they spin up their cluster.
** Following above, it should be required to be set so that we don't have
people accidentally using random allocation. I think we should really be
aiming to get rid of random allocation completely, but provide a new
property to enable it for backwards compatibility (also for testing).
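
The rules in the list above can be sketched as a small startup check. This is
an illustration of the proposal only, not Cassandra code: the function name,
the return values, and the choice to let random_token_allocation take
precedence when both options are set are assumptions.

```python
def validate_token_config(num_tokens=16,
                          allocate_tokens_for_rf=None,
                          random_token_allocation=False):
    """Return the allocation mode implied by the proposed settings.

    Hypothetical sketch of points 2-5 of the proposal:
    - num_tokens defaults to 16 (or whatever is found appropriate),
    - random_token_allocation defaults to false,
    - allocate_tokens_for_rf defaults to unset (None here),
    - allocate_tokens_for_rf is required when num_tokens > 1 and
      random allocation has not been explicitly enabled.
    """
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    if num_tokens == 1:
        return "single-token"  # neither option is required in this case
    if random_token_allocation:
        # Assumption: an explicit opt-in to random allocation wins, and any
        # allocate_tokens_for_rf value is ignored.
        return "random"
    if allocate_tokens_for_rf is None:
        raise ValueError(
            "allocate_tokens_for_rf must be set when num_tokens > 1 "
            "unless random_token_allocation is true"
        )
    if allocate_tokens_for_rf < 1:
        raise ValueError("allocate_tokens_for_rf must be at least 1")
    return "algorithm"
```

With this shape, a node started with only the defaults fails fast, which is
the point of leaving allocate_tokens_for_rf unset: the operator has to pick
an RF or opt in to random allocation before the cluster comes up.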

It's worth noting that a smaller number of tokens *theoretically* increases
the time for replacement/rebuild (fewer nodes to stream from), so if we're
considering QUORUM availability with vnodes there's an argument against
having a very low num_tokens. I think it's better to utilise NTS and racks
to reduce the chance of a QUORUM outage than to bank on having a lower
number of tokens: with just a low number of tokens, unless you go all the
way to 1, you are just relying on luck that 2 nodes don't overlap. Guess
what I'm saying is that I think we should be choosing a num_tokens that
gives the best distribution for most cluster sizes rather than choosing one
that "decreases" the probability of an outage.

Also I think we should continue using CASSANDRA-13701 to track this. TBH I
think in general we should be a bit better at searching for and using
existing tickets...

On Sat, 22 Sep 2018 at 18:13, Stefan Podkowinski  wrote:

> There already have been some discussions on this here:
> https://issues.apache.org/jira/browse/CASSANDRA-13701
>
> The mentioned blocker there on the token allocation shouldn't exist
> anymore. Although it would be good to get more feedback on it, in case
> we want to enable it by default, along with new defaults for number of
> tokens.

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread Stefan Podkowinski
There have already been some discussions on this here:
https://issues.apache.org/jira/browse/CASSANDRA-13701

The blocker mentioned there on the token allocation shouldn't exist
anymore, although it would be good to get more feedback on it in case
we want to enable it by default, along with new defaults for the number of
tokens.


On 22.09.18 06:30, Dinesh Joshi wrote:
> Jon, thanks for starting this thread!
>
> I have created CASSANDRA-14784 to track this. 
>
> Dinesh

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Dikang Gu
We are using 8 or 16 tokens internally, with the token allocation algorithm
enabled. The range distribution is good for us.

Dikang.

On Fri, Sep 21, 2018 at 9:30 PM Dinesh Joshi 
wrote:

> Jon, thanks for starting this thread!
>
> I have created CASSANDRA-14784 to track this.
>
> Dinesh

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Dinesh Joshi
Jon, thanks for starting this thread!

I have created CASSANDRA-14784 to track this. 

Dinesh

> On Sep 21, 2018, at 9:18 PM, Sankalp Kohli  wrote:
> 
> Putting it on JIRA is to make sure someone is assigned to it and it is 
> tracked. Changes should be discussed over ML like you are saying. 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Sankalp Kohli
Putting it on JIRA is to make sure someone is assigned to it and it is tracked. 
Changes should be discussed over ML like you are saying. 

On Sep 21, 2018, at 21:02, Jonathan Haddad  wrote:

>> We should create a JIRA to find what other defaults we need revisit.
> 
> Changing a default is a pretty big deal, I think we should discuss any
> changes to defaults here on the ML before moving it into JIRA.  It's nice
> to get a bit more discussion around the change than what happens in JIRA.
> 
> We (TLP) did some testing on 4 tokens and found it to work surprisingly
> well.   It wasn't particularly formal, but we verified the load stays
> pretty even with only 4 tokens as we added nodes to the cluster.  Higher
> token count hurts availability by increasing the number of nodes any given
> node is a neighbor with, meaning any 2 nodes that fail have an increased
> chance of downtime when using QUORUM.  In addition, with the recent
> streaming optimization it seems the token counts will give a greater chance
> of a node streaming entire sstables (with LCS), meaning we'll do a better
> job with node density out of the box.
> 
> Next week I can try to put together something a little more convincing.
> Weekend time.
> 
> Jon



Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jonathan Haddad
> We should create a JIRA to find what other defaults we need revisit.

Changing a default is a pretty big deal; I think we should discuss any
changes to defaults here on the ML before moving it into JIRA.  It's nice
to get a bit more discussion around the change than what happens in JIRA.

We (TLP) did some testing on 4 tokens and found it to work surprisingly
well.  It wasn't particularly formal, but we verified the load stays
pretty even with only 4 tokens as we added nodes to the cluster.  A higher
token count hurts availability by increasing the number of nodes any given
node is a neighbor with, meaning the failure of any 2 nodes has an
increased chance of causing downtime when using QUORUM.  In addition, with
the recent streaming optimization it seems lower token counts will give a
greater chance of a node streaming entire sstables (with LCS), meaning
we'll do a better job with node density out of the box.
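
The neighbor argument can be checked with a rough simulation. The sketch
below is not the TLP test: it assumes random token assignment, a
SimpleStrategy-style placement (the next RF distinct nodes clockwise, no
racks or DCs), and RF=3, and the function names are hypothetical. It counts
the fraction of node pairs that share at least one replica set; with RF=3
and QUORUM, losing both nodes of such a pair makes some range unavailable.

```python
import itertools
import random


def build_ring(nodes, num_tokens, rng):
    """Return node ids in ring order, one entry per randomly placed token."""
    ring = [(rng.random(), node)
            for node in range(nodes)
            for _ in range(num_tokens)]
    ring.sort()
    return [node for _, node in ring]


def replica_sets(ring, rf):
    """Collect the distinct replica sets over all token ranges.

    Placement assumption: walk clockwise from each token and take the
    first `rf` distinct nodes, ignoring racks and datacenters.
    """
    if len(set(ring)) < rf:
        raise ValueError("need at least rf distinct nodes")
    size = len(ring)
    result = set()
    for start in range(size):
        replicas = []
        i = start
        while len(replicas) < rf:
            node = ring[i % size]
            if node not in replicas:
                replicas.append(node)
            i += 1
        result.add(frozenset(replicas))
    return result


def risky_pair_fraction(nodes, num_tokens, rf=3, seed=0):
    """Fraction of node pairs that are replicas for a common range."""
    rng = random.Random(seed)
    ring = build_ring(nodes, num_tokens, rng)
    risky = set()
    for replicas in replica_sets(ring, rf):
        risky.update(itertools.combinations(sorted(replicas), 2))
    total_pairs = nodes * (nodes - 1) // 2
    return len(risky) / total_pairs


if __name__ == "__main__":
    for num_tokens in (1, 4, 16, 256):
        fraction = risky_pair_fraction(60, num_tokens)
        print(f"60 nodes, {num_tokens:3d} tokens: "
              f"{fraction:.1%} of node pairs share a range")
```

Rack-aware placement with NTS changes the picture, so the numbers from this
sketch should be read as an illustration of the trend rather than as a
prediction for a real cluster.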

Next week I can try to put together something a little more convincing.
Weekend time.

Jon


On Fri, Sep 21, 2018 at 8:45 PM sankalp kohli 
wrote:

> +1 to lowering it.
> Thanks Jon for starting this.We should create a JIRA to find what other
> defaults we need revisit. (Please keep this discussion for "default token"
> only.  )


-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread sankalp kohli
+1 to lowering it.
Thanks Jon for starting this. We should create a JIRA to find what other
defaults we need to revisit. (Please keep this discussion for "default token"
only.)

On Fri, Sep 21, 2018 at 8:26 PM Jeff Jirsa  wrote:

> Also agree it should be lowered, but definitely not to 1, and probably
> something closer to 32 than 4.
>
> --
> Jeff Jirsa


Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jeff Jirsa
Also agree it should be lowered, but definitely not to 1, and probably 
something closer to 32 than 4.

-- 
Jeff Jirsa


> On Sep 21, 2018, at 8:24 PM, Jeremy Hanna  wrote:
> 
> I agree that it should be lowered. What I’ve seen debated a bit in the past 
> is the number but I don’t think anyone thinks that it should remain 256.



Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jeremy Hanna
I agree that it should be lowered. What I’ve seen debated a bit in the past is
the number, but I don’t think anyone thinks that it should remain 256.

> On Sep 21, 2018, at 7:05 PM, Jonathan Haddad  wrote:
> 
> One thing that's really, really bothered me for a while is how we default
> to 256 tokens still.  There's no experienced operator that leaves it as is
> at this point, meaning the only people using 256 are the poor folks that
> just got started using C*.  I've worked with over a hundred clusters in the
> last couple years, and I think I only worked with one that had lowered it
> to something else.
> 
> I think it's time we changed the default to 4 (or 8, up for debate).
> 
> To improve the behavior, we need to change a couple other things.  The
> allocate_tokens_for_keyspace setting is... odd.  It requires you have a
> keyspace already created, which doesn't help on new clusters.  What I'd
> like to do is add a new setting, allocate_tokens_for_rf, and set it to 3 by
> default.
> 
> To handle clusters that are already using 256 tokens, we could prevent the
> new node from joining unless a -D flag is set to explicitly allow
> imbalanced tokens.
> 
> We've agreed to a trunk freeze, but I feel like this is important enough
> (and pretty trivial) to do now.  I'd also personally characterize this as a
> bug fix since 256 is horribly broken when the cluster gets to any
> reasonable size, but maybe I'm alone there.
> 
> I honestly can't think of a use case where random tokens is a good choice
> anymore, so I'd be fine / ecstatic with removing it completely and
> requiring either allocate_tokens_for_keyspace (for existing clusters)
> or allocate_tokens_for_rf
> to be set.
> 
> Thoughts?  Objections?
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade




Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread dinesh.jo...@yahoo.com.INVALID
Logistics aside, I think it is a good idea to default to 1 token (or a low
number). Let the user understand what it means to go beyond 1 and tune things
based on their needs.

Dinesh

On Friday, September 21, 2018, 5:06:14 PM PDT, Jonathan Haddad wrote:

> One thing that's really, really bothered me for a while is how we default
> to 256 tokens still.  There's no experienced operator that leaves it as is
> at this point, meaning the only people using 256 are the poor folks that
> just got started using C*.  I've worked with over a hundred clusters in the
> last couple years, and I think I only worked with one that had lowered it
> to something else.
>
> I think it's time we changed the default to 4 (or 8, up for debate).
>
> To improve the behavior, we need to change a couple other things.  The
> allocate_tokens_for_keyspace setting is... odd.  It requires you have a
> keyspace already created, which doesn't help on new clusters.  What I'd
> like to do is add a new setting, allocate_tokens_for_rf, and set it to 3 by
> default.
>
> To handle clusters that are already using 256 tokens, we could prevent the
> new node from joining unless a -D flag is set to explicitly allow
> imbalanced tokens.
>
> We've agreed to a trunk freeze, but I feel like this is important enough
> (and pretty trivial) to do now.  I'd also personally characterize this as a
> bug fix since 256 is horribly broken when the cluster gets to any
> reasonable size, but maybe I'm alone there.
>
> I honestly can't think of a use case where random tokens is a good choice
> anymore, so I'd be fine / ecstatic with removing it completely and
> requiring either allocate_tokens_for_keyspace (for existing clusters)
> or allocate_tokens_for_rf
> to be set.
>
> Thoughts?  Objections?
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade

[DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jonathan Haddad
One thing that's really, really bothered me for a while is how we default
to 256 tokens still.  There's no experienced operator that leaves it as is
at this point, meaning the only people using 256 are the poor folks that
just got started using C*.  I've worked with over a hundred clusters in the
last couple years, and I think I only worked with one that had lowered it
to something else.

I think it's time we changed the default to 4 (or 8, up for debate).

To improve the behavior, we need to change a couple other things.  The
allocate_tokens_for_keyspace setting is... odd.  It requires you have a
keyspace already created, which doesn't help on new clusters.  What I'd
like to do is add a new setting, allocate_tokens_for_rf, and set it to 3 by
default.

To handle clusters that are already using 256 tokens, we could prevent the
new node from joining unless a -D flag is set to explicitly allow
imbalanced tokens.

We've agreed to a trunk freeze, but I feel like this is important enough
(and pretty trivial) to do now.  I'd also personally characterize this as a
bug fix since 256 is horribly broken when the cluster gets to any
reasonable size, but maybe I'm alone there.

I honestly can't think of a use case where random tokens is a good choice
anymore, so I'd be fine / ecstatic with removing it completely and
requiring either allocate_tokens_for_keyspace (for existing clusters)
or allocate_tokens_for_rf
to be set.

Thoughts?  Objections?
-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade
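
Editor's sketch (not part of the original message): the join guard proposed
above could look roughly like the following. This is a minimal illustration
in Python, not Cassandra code. The names NodeConfig, check_join, JoinRefused
and allow_imbalanced_tokens are hypothetical, and the message does not name
the -D flag, so a boolean field stands in for it here. It also assumes that
"imbalanced" means the joining node's num_tokens differs from the value the
existing nodes use.

```python
from dataclasses import dataclass
from typing import Iterable


class JoinRefused(Exception):
    """Raised when a node is not allowed to join with its token count."""


@dataclass(frozen=True)
class NodeConfig:
    num_tokens: int
    # Hypothetical stand-in for the unnamed -D override flag in the proposal.
    allow_imbalanced_tokens: bool = False


def check_join(cluster_num_tokens: Iterable[int], node: NodeConfig) -> None:
    """Refuse the join if the node's token count differs from the cluster's,
    unless the operator has explicitly opted in to imbalanced tokens."""
    existing = set(cluster_num_tokens)
    if not existing:
        return  # brand-new cluster: nothing to be imbalanced against
    if existing == {node.num_tokens}:
        return  # same token count as every existing node
    if node.allow_imbalanced_tokens:
        return  # operator accepted the imbalance explicitly
    raise JoinRefused(
        f"node has num_tokens={node.num_tokens}, "
        f"cluster uses {sorted(existing)}; set the override to join anyway"
    )
```

Under these assumptions, a node configured with 4 tokens would be refused by
an existing 256-token cluster unless the override is set, while a node joining
an empty cluster or one with a matching token count would pass.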