What is the correct procedure for data re-partitioning?

Suppose I have 3 nodes - "A", "B", "C" - with these tokens on the ring:

A: 0
B: 2.8356863910078205288614550619314e+37
C: 5.6713727820156410577229101238628e+37
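(As an aside, a rough, purely illustrative Python sketch of how evenly spaced tokens could be computed, assuming RandomPartitioner's 0..2**127 token space; this is not a Cassandra tool:)

# Illustrative only: evenly spaced tokens for an n-node ring,
# assuming the 0 .. 2**127 RandomPartitioner token space.
def initial_tokens(n):
    return [i * 2 ** 127 // n for i in range(n)]

print(initial_tokens(3))  # spacing for a 3-node ring
print(initial_tokens(4))  # spacing after growing to 4 nodes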
Then I add node "D", token: 1.4178431955039102644307275309655e+37 (B/2)
Start node "D" with -b
Wait
Run nodeprobe -host hostB ... cleanup on the live node "B"
Wait
Done

Now the data is not evenly balanced, because the tokens are not evenly spaced.

I see that there is a TokenUpdater tool (org.apache.cassandra.tools.TokenUpdater).
What happens to keys and data if I run it on "A", "B", "C" and "D" with new, better-spaced tokens?
Should I? Is there a better procedure? (A rough sketch of the resulting ring ownership follows after the quoted thread below.)

On Thu, Oct 1, 2009 at 12:48 PM, Jonathan Ellis <[email protected]> wrote:
> On Thu, Oct 1, 2009 at 11:26 AM, Igor Katkov <[email protected]> wrote:
> > Hi,
> >
> > Question#1:
> > How to manually select tokens to force equal spacing of tokens around the
> > hash space?
>
> (Answered by Jun.)
>
> > Question#2:
> > Let's assume that #1 was resolved somehow and key distribution is more or
> > less even.
> > A new node "C" joins the cluster. Its token falls somewhere between two
> > other tokens on the ring (from nodes "A" and "B", clockwise-ordered). From
> > now on "C" is responsible for a portion of data that used to exclusively
> > belong to "B".
> > a. Cassandra v.0.4 will not automatically transfer this data to "C", will it?
>
> It will, if you start C with the -b ("bootstrap") flag.
>
> > b. Do all reads to these keys fail?
>
> No.
>
> > c. What happens with the data referenced by these keys on "B"? It will never
> > be accessed there, therefore it becomes garbage. Since there is no GC, will
> > it stick around forever?
>
> nodeprobe cleanup after the bootstrap completes will instruct B to
> throw out data that has been copied to C.
>
> > d. What happens to replicas of these keys?
>
> These are also handled by -b.
>
> -Jonathan
>
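Here is the ownership sketch mentioned above - plain illustrative Python, not a Cassandra tool. It assumes RandomPartitioner's 0..2**127 token space and that each node owns the range from the previous token (exclusive) up to its own token (inclusive); the "evenly spaced" assignment is just one possible rebalanced layout.

RING = 2.0 ** 127  # assumed RandomPartitioner token space

def ownership(tokens):
    # Fraction of the ring each node owns: the range (previous token, own token],
    # wrapping around at the top of the ring.
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    shares = {}
    for i, (node, tok) in enumerate(ordered):
        prev = ordered[i - 1][1]  # for i == 0 this wraps to the largest token
        shares[node] = ((tok - prev) % RING) / RING
    return shares

# Tokens from the example above, after bootstrapping "D" at B/2.
after_adding_d = {
    "A": 0.0,
    "D": 1.4178431955039102644307275309655e+37,
    "B": 2.8356863910078205288614550619314e+37,
    "C": 5.6713727820156410577229101238628e+37,
}

# One possible evenly spaced layout for the same four nodes.
evenly_spaced = {node: i * RING / 4 for i, node in enumerate(["A", "D", "B", "C"])}

for label, toks in [("current", after_adding_d), ("evenly spaced", evenly_spaced)]:
    print(label, {n: round(100 * s, 1) for n, s in ownership(toks).items()})

Under that 2**127 assumption the current tokens work out to very uneven ownership (roughly A 66.7%, D 8.3%, B 8.3%, C 16.7%) versus 25% each when evenly spaced, which is what the "better spaced tokens" question above is getting at.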
