On Sat, Aug 31, 2013 at 01:27:41PM +0530, Sachin Shetty wrote:
> We did try consistent hashing, but I found better distribution without it.

That's known and normal.

> We don't add or remove servers often so we should be ok.

It actually depends on what you do with them, because most places will
not accept the whole farm going down because one server failed and
caused 100% redistribution. With reverse caches it is generally not a
big issue, because the number of objects is very limited and the caches
can refill quickly. But outgoing caches generally take ages to fill up.
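
To give a rough idea of the redistribution effect, here is a small Python
sketch. It assumes a map-based hash is simply "hash modulo number of
servers", which is a simplification, and the server names, the key names
and the use of CRC32 are all made up for the illustration. It counts how
many keys change server when one server leaves the farm, looking only at
keys whose original server is still up.

```python
import zlib

def map_hash(key: str, servers: list[str]) -> str:
    """Modulo-based mapping (assumed model): index the list by hash % size."""
    return servers[zlib.crc32(key.encode()) % len(servers)]

def moved_fraction(keys, before, after):
    """Share of keys that change server although their old server is still up."""
    survivors = [k for k in keys if map_hash(k, before) in after]
    moved = sum(1 for k in survivors if map_hash(k, before) != map_hash(k, after))
    return moved / len(survivors)

# Hypothetical farm and keys, only for the demonstration.
keys = [f"host{i}.example.com" for i in range(12000)]
farm = ["s0", "s1", "s2", "s3"]
farm_minus_one = ["s0", "s2", "s3"]  # s1 went down

fraction = moved_fraction(keys, farm, farm_minus_one)
print(f"{fraction:.0%} of the keys on healthy servers changed server")
```

In this simplified model most keys land on a different server after the
change, even though their own server never went down, which is exactly
what hurts a cache that is slow to refill.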

> Our total pool is
> sized correctly and we are able to serve 100% requests when we use
> roundrobin, however sticky on host is what causes some nodes to hit
> maxconn. My goal is to never send a 503 as long as we have other nodes
> available which is always the case in our pool.

OK so if we perform the proposed change it will not match your usage
since you're not using consistent hashing anyway. So we might have to
add another explicit option such as loose/strict assignment of the
server. We could have 3 levels BTW :

   - no-queue : find another server if the destination is full
   - loose    : find another server if the destination has reached maxqueue
   - strict   : never switch to another server
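
As a rough sketch of what these three levels could mean, here is some
Python (for readability only, this is not haproxy code). Everything in
it is hypothetical: the `Server` fields, the `pick` function and the
fallback rule (take the first other server that still has a free
connection slot) are assumptions, and `maxqueue` is treated as a plain
positive limit.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    maxconn: int       # max concurrent connections
    maxqueue: int      # max queued requests (plain positive limit here)
    conns: int = 0     # current connections
    queued: int = 0    # current queue length

    def is_full(self) -> bool:
        return self.conns >= self.maxconn

    def queue_is_full(self) -> bool:
        return self.queued >= self.maxqueue

def pick(hashed: Server, others: list, mode: str) -> Server:
    """Return the server to use, given the one the hash designated."""
    if mode == "strict":
        return hashed                      # never switch
    if mode == "no-queue":
        must_switch = hashed.is_full()     # switch as soon as it is full
    elif mode == "loose":
        must_switch = hashed.queue_is_full()  # switch once maxqueue is reached
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not must_switch:
        return hashed
    # Assumed fallback: first other server with a free slot.
    for candidate in others:
        if not candidate.is_full():
            return candidate
    return hashed                          # nothing better, stay where we are

a = Server("a", maxconn=2, maxqueue=1, conns=2)  # full, queue still empty
b = Server("b", maxconn=2, maxqueue=1)
print(pick(a, [b], "no-queue").name, pick(a, [b], "loose").name,
      pick(a, [b], "strict").name)
```

With `a` full but its queue empty, only "no-queue" moves the request
away; "loose" starts doing so once the queue has reached `maxqueue`.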

I would just like to find a clean way to do this for the map-based hash
that you're using, without recomputing a map that excludes the unusable
server(s), while sticking as much as possible to the same servers to
optimize the hit rate.

Maybe scanning the table for the next usable server will be enough, though
it will not pick the same servers as the ones that would be used if the
farm size changed. This could be a limitation that has to be accepted for
this.
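
A minimal sketch of that scan, in Python and under the assumption that
the map is a plain table indexed by hash modulo its size. The names
(`next_usable`, `table`, `usable`) are invented for the example and do
not refer to existing code.

```python
def next_usable(table, index, usable):
    """Scan the table from `index`, wrapping around, for a usable server."""
    n = len(table)
    for step in range(n):
        candidate = table[(index + step) % n]
        if candidate in usable:
            return candidate
    return None  # no usable server at all

table = ["s0", "s1", "s2", "s3"]
usable = {"s0", "s2", "s3"}          # s1 is full or down

h = 5                                # some hash value
scanned = next_usable(table, h % len(table), usable)

# What a recomputed map (farm shrunk to 3 servers) would have picked.
recomputed_table = ["s0", "s2", "s3"]
recomputed = recomputed_table[h % len(recomputed_table)]

print(scanned, recomputed)
```

Keys whose server is still usable stay exactly where they were, which
is good for the hit rate. For the others, the scan may pick a different
server than a recomputed map would (here `scanned` and `recomputed`
differ), which is the limitation mentioned above.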

Willy

