On Wed, Feb 10, 2010 at 12:45 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Tue, Feb 9, 2010 at 6:12 PM, Jaakko <rosvopaalli...@gmail.com> wrote:
>> Let us suppose that all ranges are equal in size. In this case G's
>> range is A-G. If X boots in G's DC, it should take a token in the
>> middle of this range, which would be somewhere around D. If X boots
>> behind D
>
> Ah, I see, you are saying, "G has replicas from A-G, so really it
> should take a part of E's range instead of G's."

More like G has replicas from A-G, so X should take half of the
replicas :)

> That seems reasonable, although it feels a little weird for X to ask G
> for a token and be given one that G isn't the primary for.

Yeah, it is a bit counterintuitive, but if we consider where G's load
comes from (replicas), it is natural to try to divide that range in
half instead of just considering what G's primary range is.

> You're always going to have situations where a simple algorithm does
> the "wrong" thing though, which is why we leave the raw move command
> exposed.

Yes, that is of course true. However, I don't think this modification
would make the algorithm much less simple. We still consider only the
most loaded node, but also take into account which DC that node is in.
Without that extra step, loadbalance only works for rack unaware. If we
make this change, nothing would change for rack unaware, but for other
strategies things would be better, I think.

-Jaakko
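
To make the idea concrete, here is a rough sketch of the token choice discussed above. It is only an illustration, not the actual loadbalance code: the function names, the integer tokens (A=0, B=10, ..., G=60) and the ring size of 70 are all made up for the example.

```python
def midpoint(left: int, right: int, ring_size: int) -> int:
    """Token halfway along the arc from left to right, walking clockwise."""
    span = (right - left) % ring_size  # handles wrap-around past zero
    return (left + span // 2) % ring_size


def pick_bootstrap_token(primary_start: int, replica_start: int,
                         own_token: int, dc_aware: bool,
                         ring_size: int) -> int:
    """Pick a token for a new node that asks the most loaded node.

    Hypothetical helper: when dc_aware is True, split the whole span the
    loaded node holds replicas for; otherwise split only its primary range.
    """
    left = replica_start if dc_aware else primary_start
    return midpoint(left, own_token, ring_size)


# Made-up ring: A=0, B=10, C=20, D=30, E=40, F=50, G=60, ring size 70.
# G's primary range starts at F (50); its replicas span A (0) through G (60).
RING = 70
dc_aware_token = pick_bootstrap_token(50, 0, 60, True, RING)
rack_unaware_token = pick_bootstrap_token(50, 0, 60, False, RING)
print(dc_aware_token)      # around D
print(rack_unaware_token)  # halfway between F and G
```

With rack unaware the result is the same as splitting G's primary range, so nothing changes there; only the DC-aware case moves the new token back toward D.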