On Sat, 8 Sep 2018, 19:00 Jeff Jirsa, <jji...@gmail.com> wrote:

> Virtual nodes accomplish two primary goals
>
> 1) it makes it easier to gradually add/remove capacity to your cluster by
> distributing the new host capacity around the ring in smaller increments
>
> 2) it increases the number of sources for streaming, which speeds up
> bootstrap and decommission
>
> Whether or not either of these actually is true depends on a number of
> factors, like your cluster size (for #1) and your replication factor (for
> #2). If you have 4 hosts and 4 tokens per host and add a 5th host, you’ll
> probably add a neighbor near each existing host (#1) and stream from every
> other host (#2), so that’s great. If you have 20 hosts and add a new host
> with 4 tokens, most of your existing ranges won’t change at all - you’re
> nominally adding 5% of your cluster capacity but you won’t see a 5%
> improvement because you don’t have enough tokens to move 5% of your ranges.
> If you had 32 tokens, you’d probably actually see that 5% improvement,
> because you’d likely add a new range near each of the existing ranges.
>

Jeff,

I'm a bit lost here: are you referring to streaming speed improvement or
cluster capacity increase?

> Going down to 1 token would mean you’d probably need to manually move
> tokens after each bootstrap to rebalance, which is fine, it just takes more
> operator awareness.
>

Right. This is the old pre-vnodes story: you can only scale out and keep
the cluster balanced if you double the number of nodes. Otherwise you have
to move the tokens.

What's not clear to me is why 4 tokens (as opposed to only 1) should be
enough for adding a small number of nodes while keeping the balance.

Assuming we have 3 racks, we would add 3 nodes at a time for scaling out.
With 4 tokens we split only 12 ranges across the ring this way. I would
think it depends on the current cluster size, but empirically the load skew
at first gets worse (for middle-sized clusters) and then probably cancels
out for bigger sizes. Has anyone tried to do the actual math for this?
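
For what it's worth, one rough way to get a feel for the numbers is a small
Monte Carlo sketch like the one below. It is only an illustration under
strong assumptions that are mine, not anything from Cassandra or DSE: tokens
are picked uniformly at random (no allocation algorithm), only primary
ownership is counted (replication factor and racks are ignored), and skew is
defined as the largest node's share divided by the average share. All
function names are made up for the example.

```python
import random

RING = 2**64  # size of the token space (assumption: 64-bit tokens, shifted to 0..2**64-1)


def ownership(tokens_by_node):
    """Return the fraction of the ring each node owns as primary replica.

    A token owns the range from the previous token on the ring (exclusive)
    up to itself (inclusive), wrapping around at the end of the ring.
    """
    ring = sorted((t, n) for n, toks in tokens_by_node.items() for t in toks)
    if len(ring) == 1:
        # A single token owns the whole ring.
        return {ring[0][1]: 1.0}
    owned = {n: 0 for n in tokens_by_node}
    for i, (tok, node) in enumerate(ring):
        prev = ring[i - 1][0]  # for i == 0 this wraps to the last token
        owned[node] += (tok - prev) % RING
    return {n: width / RING for n, width in owned.items()}


def skew(num_nodes, num_tokens, rng):
    """Largest share divided by the mean share for one random token layout."""
    tokens = {
        node: [rng.randrange(RING) for _ in range(num_tokens)]
        for node in range(num_nodes)
    }
    shares = ownership(tokens)
    return max(shares.values()) * num_nodes  # mean share is 1 / num_nodes


def mean_skew(num_nodes, num_tokens, trials=100, seed=42):
    """Average the skew over several random layouts (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(skew(num_nodes, num_tokens, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for nodes in (6, 12, 24, 48, 96):
        row = [round(mean_skew(nodes, vnodes), 2) for vnodes in (1, 4, 32)]
        print(nodes, "nodes; skew for 1/4/32 tokens:", row)
```

A value of 1.0 would mean a perfectly balanced ring. Since this uses purely
random tokens, it should be read as a pessimistic baseline compared to
whatever a token allocation algorithm achieves, not as a model of it.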

> I don’t know how DSE calculates which replication factor to use for their
> token allocation logic, maybe they guess or take the highest or something.
> Cassandra doesn’t - we require you to be explicit, but we could probably do
> better here.
>

I believe that DSE also doesn't calculate it--you specify the RF to
optimize for in the config. At least their config parameter is called
allocate_tokens_for_local_replication_factor:
https://docs.datastax.com/en/dse/5.1/dse-dev/datastax_enterprise/config/configVnodes.html

That being said, I have never used DSE, hence my question.

Cheers,
--
Alex
