> it will migrate you to virtual nodes by splitting the existing partition
> 256 ways.

Out of curiosity, is it for the purpose of avoiding streaming?

> the former would require you to perform a shuffle to achieve that.

Is there a nodetool option or are there other ways "shuffle" could be done
automatically?

On Thu, Nov 1, 2012 at 2:17 AM, Eric Evans <eev...@acunu.com> wrote:

> On Wed, Oct 31, 2012 at 11:38 AM, John Sanda <john.sa...@gmail.com> wrote:
> > Can/should i assume that i will get even range distribution or close to
> > it with random token selection?
>
> The short answer is: If you're using virtual nodes, random token
> selection will give you even range distribution.
>
> The somewhat longer answer is that this is really a function of the
> total number of tokens. The more randomly generated tokens a cluster
> has, the more distribution will even out. The reason this can work
> for virtual nodes where it has not for the older 1-token-per-node
> model is because (assuming a reasonable num_tokens value), virtual
> nodes gives you a much higher token count for a given number of nodes.
>
> That wiki page you cite wasn't really intended to be documentation
> (expect some of that soon though), but what that section was trying to
> convey was that while random distribution is quite good, it may not be
> 100% perfect, especially when the number of nodes is low (remember,
> the number of tokens scales with the number of nodes). I think this
> is (or may be) a problem for some. If you're forced to manually
> calculate tokens then you are quite naturally going to calculate a
> perfect distribution, and if you've grown accustomed to this, seeing
> the ownership values off by a few percent could really bring out your
> inner OCD. :)
>
> > For the sake of discussion, what is a reasonable default to start
> > with for num_tokens assuming nodes are homogenous? That wiki page
> > mentions a default of 256 which I see commented out in cassandra.yaml;
> > however, Config.num_tokens is set to 1.
>
> The (unconfigured) default is 1. That is to say that virtual nodes is
> not enabled. The current recommendation when setting this,
> (documented in the config) is 256.
>
> > Maybe I missed where the default of 256 is
> > used. From some initial testing though, it looks like 1 token per node is
> > being used. Using defaults in cassandra.yaml, I see this in my logs,
>
> Right. And it's worth noting that if you uncomment num_tokens *after*
> starting a node with it commented (i.e. num_tokens: 1), then it will
> migrate you to virtual nodes by splitting the existing partition 256
> ways. This is *not* the equivalent of starting a node with num_tokens
> = 256 for the first time. The latter would leave you with randomized
> placement, the former would require you to perform a shuffle to
> achieve that.
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
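
A rough way to see the point about total token count is to simulate it.
The sketch below is not Cassandra code. The helper names `ownership` and
`spread` are made up for illustration, and the ring size assumes a
RandomPartitioner-style token space of 0 to 2**127. Each node simply picks
num_tokens tokens uniformly at random, and each token owns the range back
to the previous token on the ring.

```python
import random

# Assumption: RandomPartitioner-style token space, 0 .. 2**127 - 1.
RING_SIZE = 2**127


def ownership(num_nodes, num_tokens, seed=0):
    """Fraction of the ring owned by each node when every node picks
    num_tokens tokens uniformly at random (hypothetical model)."""
    rng = random.Random(seed)
    # (token, owning node) pairs, sorted by position on the ring.
    ring = sorted(
        (rng.randrange(RING_SIZE), node)
        for node in range(num_nodes)
        for _ in range(num_tokens)
    )
    owned = [0] * num_nodes
    # Start from the last token, shifted back one lap, so the first
    # token's range wraps around the end of the ring.
    prev = ring[-1][0] - RING_SIZE
    for token, node in ring:
        owned[node] += token - prev  # token owns the range (prev, token]
        prev = token
    return [o / RING_SIZE for o in owned]


def spread(fractions):
    """Largest share relative to the ideal share of 1/len(fractions).
    1.0 means a perfectly even ring; larger means more imbalance."""
    return max(fractions) * len(fractions)


if __name__ == "__main__":
    for num_tokens in (1, 16, 256):
        runs = [spread(ownership(6, num_tokens, seed=s)) for s in range(20)]
        print(num_tokens, round(sum(runs) / len(runs), 3))
```

Under this model the 1-token-per-node case should usually land far from
1.0, while 256 tokens per node should typically come within ten percent or
so of it, which is the "off by a few percent" effect described above. Note
that this only models a fresh start with random tokens, not the case where
an existing range is split 256 ways.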