yes
On Thu, Oct 1, 2009 at 12:49 PM, Igor Katkov <[email protected]> wrote:
> I see, so to make cluster always balanced (data-wise) number of nodes should
> be doubled each time.
> I see some activity in JIRA regarding load-balancing for v.0.5
> Does it target the same thing? transferring data from node to node and
> appropriately modifying tokens?
>
> On Thu, Oct 1, 2009 at 1:42 PM, Jonathan Ellis <[email protected]> wrote:
>>
>> You basically have two options. You can wipe your data, change the
>> tokens, and reload things, or you can add new nodes with -b to
>> rebalance things that way.
>>
>> On Thu, Oct 1, 2009 at 12:34 PM, Igor Katkov <[email protected]> wrote:
>> > OK, so I don't need to use tokenupdater, what are the steps to rebalance
>> > data around the circle?
>> >
>> > In my test example (see below), I have A, D, B and C (clockwise) where
>> > A holds 1/3 of the data
>> > D - 1/6
>> > B - 1/6
>> > C - 1/3
>> > I'm willing to change tokens manually, it's all right.
>> > How do I tell all nodes to move data around in version 0.4? Do I change
>> > token on node A and restart it with -b? Then same for the rest?
>> > restarting only one node at a time?
>> >
>> > On Thu, Oct 1, 2009 at 1:22 PM, Jonathan Ellis <[email protected]> wrote:
>> >>
>> >> tokenupdater does not move data around; it's just an alternative to
>> >> setting <initialtoken> on each node. so you really want to get your
>> >> tokens right for your initial set of nodes before adding data.
>> >>
>> >> we're finishing up full load balancing for 0.5 but even then it's best
>> >> to start with a reasonable distribution instead of starting with
>> >> random and forcing the balancer to move things around a bunch.
>> >>
>> >> On Thu, Oct 1, 2009 at 12:14 PM, Igor Katkov <[email protected]> wrote:
>> >> > What is the correct procedure for data re-partitioning?
>> >> > Suppose I have 3 nodes - "A", "B", "C"
>> >> > tokens on the ring:
>> >> > A: 0
>> >> > B: 2.8356863910078205288614550619314e+37
>> >> > C: 5.6713727820156410577229101238628e+37
>> >> >
>> >> > Then I add node "D", token: 1.4178431955039102644307275309655e+37
>> >> > (B/2)
>> >> > Start node "D" with -b
>> >> > Wait
>> >> > Run nodeprobe -host hostB ... cleanup on live "B"
>> >> > Wait
>> >> > Done
>> >> >
>> >> > Now data is not evenly balanced because tokens are not evenly spaced.
>> >> > I see that there is tokenupdater (org.apache.cassandra.tools.TokenUpdater)
>> >> > What happens with keys and data if I run it on "A", "B", "C" and "D"
>> >> > with new, better spaced tokens? Should I? Is there a better procedure?
>> >> >
>> >> > On Thu, Oct 1, 2009 at 12:48 PM, Jonathan Ellis <[email protected]> wrote:
>> >> >>
>> >> >> On Thu, Oct 1, 2009 at 11:26 AM, Igor Katkov <[email protected]> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > Question#1:
>> >> >> > How to manually select tokens to force equal spacing of tokens
>> >> >> > around the hash space?
>> >> >>
>> >> >> (Answered by Jun.)
>> >> >>
>> >> >> > Question#2:
>> >> >> > Let's assume that #1 was resolved somehow and key distribution is
>> >> >> > more or less even.
>> >> >> > A new node "C" joins the cluster. Its token falls somewhere
>> >> >> > between two other tokens on the ring (from nodes "A" and "B"
>> >> >> > clockwise-ordered). From now on "C" is responsible for a portion
>> >> >> > of data that used to exclusively belong to "B".
>> >> >> > a. Cassandra v.0.4 will not automatically transfer this data to
>> >> >> > "C" will it?
>> >> >>
>> >> >> It will, if you start C with the -b ("bootstrap") flag.
>> >> >>
>> >> >> > b. Do all reads to these keys fail?
>> >> >>
>> >> >> No.
>> >> >>
>> >> >> > c. What happens with the data referenced by these keys on "B"? It
>> >> >> > will never be accessed there, therefore it becomes garbage. Since
>> >> >> > there is no GC will it stick forever?
>> >> >>
>> >> >> nodeprobe cleanup after the bootstrap completes will instruct B to
>> >> >> throw out data that has been copied to C.
>> >> >>
>> >> >> > d. What happens to replicas of these keys?
>> >> >>
>> >> >> These are also handled by -b.
>> >> >>
>> >> >> -Jonathan
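For anyone who wants to check the token arithmetic in this thread, here is a rough sketch, not anything that ships with Cassandra. It assumes each node owns the range from the previous token (exclusive) up to its own token (inclusive), wrapping around the ring, which matches the description above of a new node taking over part of the next node's range. The names `even_tokens`, `ownership` and `ring_size` are made up for illustration, and the ring of size 12 is a toy value chosen for readability; the real token space depends on the partitioner.

```python
from fractions import Fraction


def even_tokens(node_count, ring_size):
    """Return node_count tokens spaced as evenly as possible on the ring."""
    return [i * ring_size // node_count for i in range(node_count)]


def ownership(tokens, ring_size):
    """Map each node name to the fraction of the ring it owns.

    Assumption: a node owns (previous token, own token], wrapping around.
    `tokens` is a dict of node name -> token, with distinct tokens.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    if len(ordered) == 1:
        return {ordered[0][0]: Fraction(1)}
    shares = {}
    for i, (node, token) in enumerate(ordered):
        previous = ordered[i - 1][1]  # index -1 wraps to the last node
        shares[node] = Fraction((token - previous) % ring_size, ring_size)
    return shares


if __name__ == "__main__":
    ring = 12  # toy ring size, hypothetical
    # Three evenly spaced nodes, then D bootstrapped halfway between A and B.
    before = dict(zip("ABC", even_tokens(3, ring)))
    after = dict(before, D=before["B"] // 2)
    for name, share in sorted(ownership(after, ring).items()):
        print(name, share)
    # Doubling the node count keeps every existing token in place.
    print(set(even_tokens(3, ring)) <= set(even_tokens(6, ring)))
```

On the toy ring this gives A and C one third each and B and D one sixth each, the same split reported above. It also shows why doubling works: the evenly spaced tokens for 6 nodes include all the tokens for 3 nodes, so no existing node has to change its token.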
