Re: [DISCUSS] changing default token behavior for 4.0

2018-09-25 Thread kurt greaves
This was exactly the kind of problem I was foreseeing. I don't see any simple way of fixing it without introducing some shuffle-like nightmare that does a whole bunch of token movements though. On the other hand we could just document best practice, and also make it so that by default you have to

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-25 Thread kurt greaves
Sounds good to me. I'm going to play around with the algorithm and actually record some numbers/evidence over the next week to help us decide. On Tue, 25 Sep 2018 at 05:38, Joseph Lynch wrote: > I am a big fan of lowering the default number of tokens for many > reasons (availability, repair,

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Benedict Elliott Smith
This sounds worthy of a bug report! We should at least document any such inadequacy, and come up with a plan to fix it. It would be great if you could file a ticket with a detailed example of the problem. > On 24 Sep 2018, at 14:57, Tom van der Woerdt > wrote: > > Late comment, but I'll

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Joseph Lynch
I am a big fan of lowering the default number of tokens for many reasons (availability, repair, etc...). I also agree there are some usability blockers to "just lowering the number today", but I very much agree that the current default of 256 random tokens is a huge bug I hope we fix by 4.0
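
As a rough illustration of why the number of randomly chosen tokens matters for balance, here is a minimal sketch (not taken from the thread, and not Cassandra's actual allocator) that places random tokens on a ring and measures how unevenly ownership ends up spread across nodes. The ring size, node count, and the function names `random_ownership` and `imbalance` are assumptions made purely for illustration.

```python
import random

RING_SIZE = 2**64  # assumed ring size, for illustration only


def random_ownership(num_nodes, num_tokens, seed=0):
    """Return the fraction of the ring owned by each node when every
    node picks num_tokens positions uniformly at random."""
    rng = random.Random(seed)
    tokens = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # A token owns the range (previous token, token]; the first token
    # wraps around and also covers the span after the last token.
    prev = tokens[-1][0] - RING_SIZE
    for tok, node in tokens:
        owned[node] += tok - prev
        prev = tok
    return [o / RING_SIZE for o in owned]


def imbalance(fractions):
    """Ratio of the most-loaded node's share to the mean share (1.0 = perfectly even)."""
    mean = sum(fractions) / len(fractions)
    return max(fractions) / mean


if __name__ == "__main__":
    # Hypothetical 12-node cluster; compare a few num_tokens settings.
    for n in (1, 4, 16, 256):
        print(n, round(imbalance(random_ownership(12, n, seed=1)), 2))
```

This only models purely random placement. It says nothing about the newer allocation algorithm discussed in the thread, nor about the availability and repair costs of a high token count.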

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-24 Thread Tom van der Woerdt
Late comment, but I'll write it anyway. The main advantage of random allocation over the new allocation strategy is that it seems to be significantly better when dealing with node *removals*, when the order of removal is not the inverse of the order of addition. This can lead to severely

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-23 Thread Nate McCall
Let's pick a default setup that works for most people (IME clusters < 30 nodes, but TLP and Instaclustr peeps probably have the most insight here). Then we just explain the heck out of it in the comments. I would also like to see this include some details add/remove a DC to change the values

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread kurt greaves
Only that it makes it easier to spin up a cluster. I'm for removing it entirely as well, however I think we should keep it around at least until the next major just as a safety precaution until the algorithm is properly battle tested. This is not a strongly held opinion though, I'm just

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread Jonathan Haddad
Is there a use case for random allocation? How does it help with testing? I can’t see a reason to keep it around. On Sat, Sep 22, 2018 at 3:06 AM kurt greaves wrote: > +1. I've been making a case for this for some time now, and was actually a > focus of my talk last week. I'd be very happy to

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread kurt greaves
+1. I've been making a case for this for some time now, and was actually a focus of my talk last week. I'd be very happy to get this into 4.0. We've tested various num_tokens with the algorithm on various sized clusters and we've found that typically 16 works best. With lower numbers we found

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-22 Thread Stefan Podkowinski
There already have been some discussions on this here: https://issues.apache.org/jira/browse/CASSANDRA-13701 The mentioned blocker there on the token allocation shouldn't exist anymore. Although it would be good to get more feedback on it, in case we want to enable it by default, along with new

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Dikang Gu
We are using 8 or 16 tokens internally, with the token allocation algorithm enabled. The range distribution is good for us. Dikang. On Fri, Sep 21, 2018 at 9:30 PM Dinesh Joshi wrote: > Jon, thanks for starting this thread! > > I have created CASSANDRA-14784 to track this. > > Dinesh > > > On

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Dinesh Joshi
Jon, thanks for starting this thread! I have created CASSANDRA-14784 to track this. Dinesh > On Sep 21, 2018, at 9:18 PM, Sankalp Kohli wrote: > > Putting it on JIRA is to make sure someone is assigned to it and it is > tracked. Changes should be discussed over ML like you are saying. > >

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Sankalp Kohli
Putting it on JIRA is to make sure someone is assigned to it and it is tracked. Changes should be discussed over ML like you are saying. On Sep 21, 2018, at 21:02, Jonathan Haddad wrote: >> We should create a JIRA to find what other defaults we need revisit. > > Changing a default is a

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jonathan Haddad
> We should create a JIRA to find what other defaults we need revisit. Changing a default is a pretty big deal, I think we should discuss any changes to defaults here on the ML before moving it into JIRA. It's nice to get a bit more discussion around the change than what happens in JIRA. We

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread sankalp kohli
+1 to lowering it. Thanks Jon for starting this. We should create a JIRA to find what other defaults we need revisit. (Please keep this discussion for "default token" only.) On Fri, Sep 21, 2018 at 8:26 PM Jeff Jirsa wrote: > Also agree it should be lowered, but definitely not to 1, and

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jeff Jirsa
Also agree it should be lowered, but definitely not to 1, and probably something closer to 32 than 4. -- Jeff Jirsa > On Sep 21, 2018, at 8:24 PM, Jeremy Hanna wrote: > > I agree that it should be lowered. What I’ve seen debated a bit in the past > is the number but I don’t think anyone

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jeremy Hanna
I agree that it should be lowered. What I’ve seen debated a bit in the past is the number but I don’t think anyone thinks that it should remain 256. > On Sep 21, 2018, at 7:05 PM, Jonathan Haddad wrote: > > One thing that's really, really bothered me for a while is how we default > to 256

Re: [DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread dinesh.jo...@yahoo.com.INVALID
Logistics aside, I think it is a good idea to default to 1 token (or a low number). Let the user understand what it means to go beyond 1 and tune things based on their needs. Dinesh On Friday, September 21, 2018, 5:06:14 PM PDT, Jonathan Haddad wrote: One thing that's really, really

[DISCUSS] changing default token behavior for 4.0

2018-09-21 Thread Jonathan Haddad
One thing that's really, really bothered me for a while is how we default to 256 tokens still. There's no experienced operator that leaves it as is at this point, meaning the only people using 256 are the poor folks that just got started using C*. I've worked with over a hundred clusters in the