Re: Experimental patch: Consistent Hashing with Bounded Loads

Willy Tarreau Sun, 11 Sep 2016 10:58:55 -0700

Hi Andrew,

On Sun, Sep 11, 2016 at 01:45:35PM -0400, Andrew Rodland wrote:
> On Sunday, September 11, 2016 9:11:50 AM EDT Willy Tarreau wrote:
> > Don't worry, that's the principle of a PoC. I looked at your code. I think
> > we can significantly simplify it. The current server's relative capacity
> > should be computed on the fly when trying to pick the server in the ring.
> > Right now you call this function chash_update_server_capacities() to
> > recompute all servers capacities before chosing your server but given that
> > you already have the information on the number of total served connections
> > in the backend, it's possible to modify the current lookup function to
> > void running this loop first. This is especially important as some setups
> > have very large numbers of servers (eg: up to 1000).
> 
> I agree that this would be better, but the algorithm is supposed to 
> distribute 
> the capacity among servers in a particular way. Concretely, say that the 
> computed capacity is 14 and there are 10 servers. Then there should be 4 
> servers with a capacity of 2 and 6 servers with a capacity of 1, and which 
> are 
> which shouldn't depend on where the request hashed to ??? there needs to be 
> some 
> way of identifying the "first" 4 servers. Walking the hash ring seems like an 
> inefficient way to do it, I agree, but I didn't have a better one.


Wait a minute, I think you're talking about the while() loop used to
find the next non-saturated server. I agree this one cannot be removed
otherwise we create a domino effect by overloading a specific server
with all requests initially made for another server. Here I'm talking
about the initial function which is used to compute all relative
capacities (sorry I don't have its name in mind and don't have access
to the sources right now). Since the servers' capacities do not change
during the loop, it's better to update the backend's average load as
servers take or release connections and be able to compute a server's
weight on the fly instead of pre-computing it.

> > Also I've been thinking about this issue of the infinite loop that you
> > solved already. As long as c > 1 I don't think it can happen at all,
> > because for any server having a load strictly greater than the average
> > load, it means there exists at least one server with a load smaller than
> > or equal to the average. Otherwise it means there's no more server in
> > the ring because all servers are down, and then the initial lookup will
> > simply return NULL. Maybe there's an issue with the current lookup
> > method, we'll have to study this.
> 
> Agreed again, it should be impossible as long as c > 1, but I ran into it. I 
> assumed it was some problem or misunderstanding in my code.

Don't worry I trust you, I was trying to figure what exact case could
cause this and couldn't find a single possible case :-/

> > 3) add a configuration option. I still don't know if it would be better
> >    in the backend or in the server. I'm tempted to think it's better to
> >    have it per backend (to avoid the risk that no server it picked) and
> >    to let people continue to use the weights to modulate the acceptation
> >    ratio as they've always been doing.
> > 
> > This last point makes me think that we should probably use "c * s->weight"
> > instead of just "c" so that each server's weight is considered in this
> > ratio. This is quite important otherwise in practice it will not be much
> > possible to apply different weights to servers.
> 
> I will have to think on how to do the weights properly, and I will probably 
> do 
> my first stab without them, but I do follow your logic.

We already use weights everywhere, namely with leastconn. While I have
no problem with reviewing a code update without them, I'd definitely
want to have them for the merge, otherwise we'd create two different
behaviours which would be confusing or even considered as bogus. Let's
reuse the same logic we have at other places (ie: number of conns per
server is divided by their relative weight so that the comparison is
performed on normalized weights).

> As for the configuration parameter, it's okay to add a double and use atof? 
> Normally I wouldn't ask, but since haproxy has no floats at all I'm double-
> checking before introducing them.

That's an excellent question, I didn't think about it. What would you
think about using integer percentages instead ? Many people produce or
update their configs using shell scripts and floats would cause some
trouble there. Using a percentage would work exactly like for weights
that cause no issue at the moment. And I don't think you'll need a
resolution finer than 1% to adjust the relative capacities. Here all
values would be 100 or larger.

Cheers,
Willy

Re: Experimental patch: Consistent Hashing with Bounded Loads

Reply via email to