On Tue, Mar 22, 2016 at 1:59 AM, Sergii Mikhtoniuk <mikhton...@gmail.com> wrote:
> Hi,
>
> My question is more about HTTP/REST in general, but I can't think of a
> better place to find experts than this mailing list.
>
> Can you share your approaches to providing back pressure with HTTP? I'm
> talking about handling cases where an upstream service is overloaded due to
> an increased request rate, or due to some problem that reduces its normal
> capacity, e.g. hardware issues or a database backup.
>
> To give you more context: we are running a set of RESTful microservices
> that are very latency-sensitive. We use HAProxy in a fully decentralized
> fashion to route requests (aka SmartStack). Suppose we have N instances
> of service A that need to speak to M instances of service B - every
> instance of A runs a local HAProxy, which is automatically reconfigured
> whenever we scale service B up or down.
>
> This model has worked really well for us, and it's also becoming very
> common in microservice environments (e.g. container schedulers such as
> Marathon and Kubernetes), but somehow no one ever mentions how to implement
> back pressure in it.
>
> The decentralized nature prevents us from using the only back pressure
> mechanism HAProxy has - maxconn/maxqueue/timeout queue. Even if we assume
> that request distribution is uniform and set A's HAProxy maxconn to
> `Capacity(B) / N`, this model breaks as soon as we have another service C
> that also makes requests to B, and we do have a complex service topology.
>
> The way we are currently solving this:
> - we assume that the upstream service (B) is the only place where we know
>   the actual capacity and current load
> - so it's the upstream service that decides whether to accept a request
>   for processing or to decline it
> - if a request is declined, we want HAProxy to try the next server in the
>   list (`option redispatch`)
> - the only way to achieve this (due to the use of `splice`) is to prevent
>   the TCP connection from ever being established
> - so we use iptables to set a hard limit on the number of active TCP
>   connections per service port
> - when all `retries` instances of the upstream service are busy, we fail
>   the request very fast, allowing the caller to perform any load-shedding
>   strategy (asking the client to try again, returning a cached result, etc.)
>
> This solution has worked well for us, but has a number of downsides:
> - it relies on iptables and conntrack, and a lot of kernel variable tweaking
> - it does not play well with keep-alive connections
> - it is hard to use in containers, especially with network isolation
>
> So we are looking at replacing it with something that works at the
> application protocol level. One idea would be:
> - have yet another HAProxy, but now on the upstream service side
> - its only purpose would be to limit `maxconn` and maintain a queue of
>   connections
> - ideally, implement an Active Queue Management (AQM) strategy such as
>   CoDel to allow this queue to absorb short-term request bursts, while
>   preventing bufferbloat-like "standing queues"
>
> To summarize:
> - Can anyone recommend a better solution for back pressure with a
>   decentralized HAProxy setup?
> - Do you think AQM for connection queues would be a good addition to HAProxy?
>
>
> Thanks,
> Sergii
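The upstream-side HAProxy idea sketched above could look roughly like the following configuration fragments. All names, addresses, and tuning values here are illustrative placeholders, not taken from the original setup:

```
# Sketch: HAProxy in front of each instance of service B (upstream side).
# Concurrency is capped at what this one instance can handle; excess
# connections wait in the queue for at most `timeout queue`, after which
# HAProxy rejects them quickly instead of letting them pile up.
frontend b_in
    bind :8080
    default_backend b_local

backend b_local
    timeout queue 50ms
    server local 127.0.0.1:8081 maxconn 100 maxqueue 20

# Sketch: service A's local (client-side) HAProxy. With `retries` and
# `option redispatch`, a request refused by one instance of B can be
# re-attempted against another instance before failing fast.
backend service_b
    retries 3
    option redispatch
    server b1 10.0.0.1:8080 check
    server b2 10.0.0.2:8080 check
```

The 50ms queue timeout and the maxconn/maxqueue numbers would need to be derived from the actual per-instance capacity and latency budget.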
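To make the AQM suggestion concrete, here is a minimal CoDel-style queue for pending requests, as it might be applied to a connection queue. This is a sketch of the core idea only (track sojourn time; shed when it stays above a target for a full interval) and deliberately omits the full RFC 8289 state machine, in particular the increasing drop-rate schedule; the TARGET and INTERVAL values are the common defaults scaled for illustration:

```python
import time
from collections import deque

TARGET = 0.005     # 5 ms: acceptable sojourn time in the queue
INTERVAL = 0.100   # 100 ms: how long sojourn may exceed TARGET

class CoDelQueue:
    """Simplified CoDel-style queue (sketch, not the full RFC 8289 algorithm)."""

    def __init__(self):
        self.q = deque()
        self.first_above = None  # when sojourn time first exceeded TARGET

    def push(self, item, now=None):
        # Record the enqueue timestamp so sojourn time can be measured.
        self.q.append((now if now is not None else time.monotonic(), item))

    def pop(self, now=None):
        """Return (item, shed): shed=True means the caller should decline
        this request (standing queue detected)."""
        now = now if now is not None else time.monotonic()
        if not self.q:
            self.first_above = None
            return None, False
        enqueued_at, item = self.q.popleft()
        sojourn = now - enqueued_at
        if sojourn < TARGET:
            # Queue is draining fast enough; reset the detector.
            self.first_above = None
            return item, False
        if self.first_above is None:
            # Sojourn just crossed TARGET; start the INTERVAL clock.
            self.first_above = now
            return item, False
        if now - self.first_above >= INTERVAL:
            # Sojourn has stayed above TARGET for a full INTERVAL:
            # this is a standing queue, not a short burst - shed load.
            return item, True
        return item, False
```

The key property is exactly what the question asks for: a short burst (sojourn briefly above TARGET) is absorbed, while a sustained overload is detected and converted into fast rejections.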
Hi Sergii,

You should have a look at the agent-check. HAProxy can periodically poll a
small daemon running on your app server. This daemon can return a keyword or
a percentage to tell HAProxy how healthy the server is from a processing
capacity point of view.

A nice example of the agent-check from Percona, lowering the weight of MySQL
slave servers based on replication lag:
https://www.percona.com/blog/2014/12/18/making-haproxy-1-5-replication-lag-aware-in-mysql/

Baptiste
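For illustration, such an agent can be very small. The sketch below answers each HAProxy agent-check probe with a weight percentage; the load-average heuristic and port 9999 are assumptions for the example (a real agent would report an application-level capacity metric), and the reply format ("up NN%\n") follows the agent-check protocol:

```python
import os
import socketserver

class AgentHandler(socketserver.BaseRequestHandler):
    """Reply to each HAProxy agent-check probe with a health percentage."""

    def handle(self):
        # Illustrative heuristic: scale the advertised weight down as the
        # 1-minute load average approaches twice the CPU count.
        # os.getloadavg() is Unix-only.
        load1, _, _ = os.getloadavg()
        cpus = os.cpu_count() or 1
        pct = max(1, min(100, int(100 * (1 - load1 / (2 * cpus)))))
        # Agent replies are a short ASCII line, e.g. "up 80%\n".
        self.request.sendall(f"up {pct}%\n".encode("ascii"))

if __name__ == "__main__":
    # Port 9999 is arbitrary; it must match `agent-port` in haproxy.cfg.
    with socketserver.TCPServer(("0.0.0.0", 9999), AgentHandler) as srv:
        srv.serve_forever()
```

On the HAProxy side, the corresponding server line would be along the lines of `server b1 10.0.0.1:8080 weight 100 check agent-check agent-port 9999 agent-inter 2s`, so the effective weight of an overloaded instance drops and traffic shifts away from it before it has to refuse connections outright.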