>> it will take half of G's range. Problem is, it will take
>> half of G's _primary_ range, but most of G's load comes from
>> _replicas_.

> From looking at the code recently, it chooses a token that splits G's load by 
> actually sampling the data stored on G, which should make the primary vs 
> replica point moot.

Yeah, I missed that change. The comments in getSplits do not reflect
it :)

However, I think this does not remove the issue that a bootstrapping
node should take the datacenter (and possibly the rack) into account.
Let's return to the original example and consider how the outcome
differs depending on whether X is in DC1 or DC2. Here's the original
ring structure:

A: H-A, F-G
B: A-B, H-A
C: B-C, H-A, A-B
D: C-D, B-C
E: D-E, C-D
F: E-F, D-E
G: F-G, A-B, B-C, C-D, D-E, E-F
H: G-H, E-F, F-G

If X is in DC1, the same DC as B and G, things are OK, as it will boot
in the middle of B and G. However, if X is added to DC2, it will also
boot in the middle of B and G. This will do nothing to balance the
load, and if multiple nodes are bootstrapping, all of them will go to
the same middle region. What they probably should do is consider only
the nodes in the DC they are booting into, and try to balance load
evenly within that DC. If the other DC does the same, overall load
should be well balanced.
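
To make the idea concrete, here is a rough sketch in Python (not the
actual bootstrap code). The Node record, the pick_split_target name,
and the cluster data are all made up for illustration. The only point
is that the candidate set is filtered by DC before the most loaded
node is picked:

```python
from collections import namedtuple

# Hypothetical node record: name, datacenter, and load in arbitrary units.
Node = namedtuple("Node", ["name", "dc", "load"])


def pick_split_target(nodes, new_node_dc):
    """Return the most loaded node in new_node_dc, or None if that DC has no nodes yet."""
    # Only nodes in the DC the new node is joining are candidates.
    same_dc = [n for n in nodes if n.dc == new_node_dc]
    if not same_dc:
        return None
    return max(same_dc, key=lambda n: n.load)


# Made-up cluster: n2 is the most loaded node overall, n4 the most loaded in DC2.
cluster = [
    Node("n1", "DC1", 20),
    Node("n2", "DC1", 60),
    Node("n3", "DC2", 10),
    Node("n4", "DC2", 30),
]

print(pick_split_target(cluster, "DC1").name)
print(pick_split_target(cluster, "DC2").name)
```

With a global choice, a node joining DC2 would still be pointed at n2.
With the per-DC filter it is pointed at n4, the most loaded node in
its own DC.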

-Jaakko
