On 2018/5/2 11:04, Willy Tarreau wrote:
> On Tue, May 01, 2018 at 09:34:14PM -0400, Patrick Hemmer wrote:
>> Would it be possible to add priority based queuing to haproxy? By this I
>> mean that when a server/backend is full (maxconn), that incoming
>> requests would be added to the queue in a custom order. The idea here is
>> that when the system is under stress, to make sure the important
>> requests get handled first.
> Hehe that's fun that you mention this, as this has been postponed since
> around 1.2 or 1.3! By then we didn't have the equivalent of HTTP rules
> to add/subtract some priority. Now we have everything to do it, we "just"
> need to replace the lists with priority trees in the queues and that's
> all. It's not a lot of work if someone is interested in working on this.
>
>> In our exact use case, we're looking to use this to help mitigate DOS
>> attacks. The idea is that if a layer 7 attack is saturating the backend
>> servers, we can add logic to prioritize the requests. This logic might
>> be things like requests that have a valid application cookie go to the
>> front of the queue, or requests that come from a cloud provider (e.g.
>> EC2) go to the back of the queue.
> That's exactly why I wanted them to be manipulated via http-request rules,
> so that everyone can construct his own conditions. Also I found that for
> most shopping sites, having time-based priority is more important than
> position-based : you often want this type of request to be processed 100ms
> faster than another type of request. With HTTP/2 it will be even more
> interesting because it will allow sending the important objects used for
> rendering before the other ones, which is very similar to the H2 priority
> but more fine-grained if you can adjust it on the fly.
>
>> DOS mitigation is hard because while you can write rules to identify
>> requests that are suspicious, you don't want to block them outright as
>> it is possible they might be legitimate. With prioritization, the
>> requests still get through, and are only affected when the backend is
>> saturated. If maxconn is not reached, the prioritization has no effect
>> at all (since queue is empty).
> I wholeheartedly agree with you :-)
>
>> I made the change to haproxy and simulated the conditions in a lab, and
>> the strategy appears to work.
>> The change to haproxy was very minor, ~10 lines in queue.c, using
>> `task->nice` as the prioritization key. However my change is a very
>> rough PoC, and not worthy of submission.
> For a rough PoC it's indeed perfectly fine. But for a final design we
> really need a separate offset. I've really been convinced in the field about
> using time rather than position. If you want to experiment with this I
> can give you some insights; it's the same in fact.
Can you elaborate on what you're thinking of for a time-based queue?
What I'm imagining you mean is that you would write a rule to set the
max queue time, and haproxy would insert it into the queue sorting on
TIME_NOW() + MAX_QUEUE_TIME. The main difference I see with this approach
vs scoring, is that you ensure that an item doesn't sit on the queue
forever (or whatever `timeout queue` is set to) if higher priority stuff
keeps getting inserted before it.
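
To make this first interpretation concrete, here is a minimal sketch in Python (not HAProxy code). The class name `DeadlineQueue` and the parameters `now_ms` and `max_queue_time_ms` are hypothetical, and a binary heap stands in for whatever tree the real queue would use. The only point is the sort key: arrival time plus the per-request max queue time.

```python
import heapq
import itertools


class DeadlineQueue:
    """Queue ordered by arrival time + max queue time (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so requests with equal deadlines keep FIFO order.
        self._seq = itertools.count()

    def push(self, name, now_ms, max_queue_time_ms):
        # Sort key is the absolute deadline: TIME_NOW() + MAX_QUEUE_TIME.
        deadline = now_ms + max_queue_time_ms
        heapq.heappush(self._heap, (deadline, next(self._seq), name))

    def pop(self):
        # The request whose deadline expires first is dequeued first.
        deadline, _, name = heapq.heappop(self._heap)
        return name, deadline


q = DeadlineQueue()
q.push("low", now_ms=0, max_queue_time_ms=500)     # deadline 500
q.push("high", now_ms=100, max_queue_time_ms=100)  # deadline 200
q.push("low2", now_ms=200, max_queue_time_ms=500)  # deadline 700
order = [q.pop()[0] for _ in range(3)]
print(order)
```

Because the key keeps growing with the clock, an old low-priority request eventually sorts ahead of newer high-priority ones, which is why nothing waits forever.
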
I don't think this is necessarily a good thing. If you're under a DOS
attack, the goal is to get the good requests processed before any
possible malicious requests. With a time based queue, those malicious
requests will still get processed and starve out the good requests. For
example, let's say you're under attack, a bad request comes in with
max_queue_time=1000ms, and then after 999ms elapse, a good request comes
in with max_queue_time=10ms. You have a good request, and a bad request
on the queue, but HAProxy is going to process the bad request first
because its timer is expiring first. Essentially if haproxy is receiving
X good requests per second, and Y bad requests per second, it's still
going to forward X good per second, and Y bad per second, to the backend
server. The only difference is that they're time shifted.
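
The example can be checked with a small Python sketch (illustrative only; the field names `arrival_ms` and `max_queue_time_ms` and the helper `dequeue_order` are hypothetical). It orders the same two requests once by deadline and once by score alone:

```python
def dequeue_order(requests, key):
    """Return request names in the order the queue would release them."""
    return [r["name"] for r in sorted(requests, key=key)]


requests = [
    {"name": "bad", "arrival_ms": 0, "max_queue_time_ms": 1000},
    {"name": "good", "arrival_ms": 999, "max_queue_time_ms": 10},
]

# Time-based: sort on arrival + max queue time (deadlines 1000 and 1009).
by_deadline = dequeue_order(
    requests, lambda r: r["arrival_ms"] + r["max_queue_time_ms"]
)

# Score-based: sort on the priority value only (10 vs 1000).
by_score = dequeue_order(requests, lambda r: r["max_queue_time_ms"])

print(by_deadline)  # ['bad', 'good']
print(by_score)     # ['good', 'bad']
```

With the deadline key the bad request wins by 9ms; with the score key the good request always goes first, no matter how long the bad one has been waiting.
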
The other thing I could think you mean by time-based is to insert into
the queue sorting on MAX_QUEUE_TIME, just like a score-based queue, but
you would still record TIME_NOW() + MAX_QUEUE_TIME, and would reject
requests that don't get processed by their deadline. Essentially a
per-request version of the `timeout queue` setting. But I don't see any
real value in this.
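
A rough Python sketch of this second interpretation, under the same caveats (hypothetical names, not HAProxy code): the queue sorts on the score only, but each entry also carries its own deadline and is rejected if that deadline has passed by the time it reaches the front.

```python
import heapq
import itertools


class ScoreQueueWithDeadline:
    """Score-ordered queue with a per-request deadline (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-breaker for equal scores

    def push(self, name, now_ms, max_queue_time_ms):
        # Sort key is MAX_QUEUE_TIME alone; the deadline is only recorded.
        deadline = now_ms + max_queue_time_ms
        heapq.heappush(
            self._heap, (max_queue_time_ms, next(self._seq), deadline, name)
        )

    def pop(self, now_ms):
        """Return (next live request or None, names rejected as expired)."""
        rejected = []
        while self._heap:
            _, _, deadline, name = heapq.heappop(self._heap)
            if now_ms > deadline:
                rejected.append(name)  # missed its own queue timeout
                continue
            return name, rejected
        return None, rejected


q = ScoreQueueWithDeadline()
q.push("a", now_ms=0, max_queue_time_ms=10)    # best score, deadline 10
q.push("b", now_ms=0, max_queue_time_ms=1000)  # worse score, deadline 1000
result = q.pop(now_ms=50)
print(result)  # ('b', ['a'])
```

In this sketch expiry is only checked lazily when an entry reaches the front of the queue; a real implementation would presumably use a timer instead.
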
Or do you mean something else?
>
>> So before continuing any further down this route, I wanted to see if
>> this is something that could make it into HAProxy, and what any thoughts
>> on it might be.
> Absolutely! I've dreamed of it for over a decade, so I'm glad someone
> is willing to take care of it! Just checked, the item was added 12
> years ago to the roadmap file in 1.2.13 on 2006/05/13 by commit 814cbc6
> ("[DOC] added (and updated) the ROADMAP file"). The lines were :
>
> - wait queues replaced for priority-based trees
> - ability to assign a prio based on L7 matching
>
> The goal has not changed since, I'm patient :-)
>
> Willy