On 2018/8/11 11:25, Aleksandar Lazic wrote:
> Hi Willy.
>
>
> On 11/08/2018 11:04, Willy Tarreau wrote:
>> Hi Aleks,
>>
>> On Sat, Aug 11, 2018 at 10:27:42AM +0200, Aleksandar Lazic wrote:
>>> I think it is a requeue mechanism for the connection and not a
>>> selector for a server or a backend, but I could be wrong.
>>
>> That's it. I'll try to explain shortly. You're probably aware of how the
>> queuing mechanism works in haproxy when maxconn is set on a server:
>> once maxconn is reached, requests are queued either into the server's
>> queue, if something forces the request to use this specific server
>> (cookie, ...), or otherwise into the backend's queue. Until now, our
>> queues were very fair (as already shown in some blog posts from various
>> users over the last decade). A server would always consult both its
>> own queue and the backend's queue and pick the oldest request, so that
>> the order of arrival was always preserved.
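(To make this concrete: queuing only kicks in once a server's maxconn is
reached, for example with something like this; the names, addresses and
numbers below are only placeholders.)

  backend app
    balance roundrobin
    cookie SRV insert indirect nocache
    # Beyond 100 concurrent connections per server, further requests wait
    # either in the server's queue (when a cookie pins them to that
    # server) or in the backend's queue.
    server s1 192.168.0.11:8080 cookie s1 maxconn 100
    server s2 192.168.0.12:8080 cookie s2 maxconn 100
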
>>
>> Now with this new feature, it is possible to change the dequeuing
>> order when a server picks a request from the queues. There are two
>> criteria for this :
>>  - the request's class
>>  - the position in the queue for a given class.
>>
>> The server will always consult classes with lowest value first. And
>> within this class, it will pick the oldest waiting request. The
>> "set-priority-class" action allows you to set the class, and mark a
>> request as very important or unimportant. A good rule of thumb is to
>> consider that rare but important requests should be given a low class
>> number, that common requests should be given the default class number
>> (unchanged) and that expensive and painful requests which are not very
>> urgent should be given a high class number (dequeued only once no other
>> request competes with them). The "set-priority-offset" action lets you
>> tweak the apparent age of the request in the queue so that a server can
>> pick a request before another one for a given class. This is not used
>> to deal with the request's importance, but with the preferred response
>> time. For example, if your site periodically refreshes its pages but
>> cannot start to render before all objects are loaded, it could happen
>> that the browser clears the page when reloading the HTML part, then
>> takes ages to fetch CSS, JS, etc., making the page unreadable for several
>> seconds every minute for a reader. By applying a time offset to these
>> requests, you can make the mandatory elements such as CSS or JS load
>> much faster than the HTML page itself, resulting in an extra delay before
>> starting to fetch the HTML part (hence the page is not yet cleared), and
>> a shorter delay for the CSS/JS parts (making the page render faster once
>> cleared).
>> Similarly you could be a hosting provider willing to offer certain
>> guarantees on the page load time to certain customers. With this you
>> could easily say that those paying for premium service are guaranteed
>> to see a 100 ms faster load time than those paying the entry level one,
>> simply by applying a -100ms offset on their requests.
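As a rough, untested sketch of those two offset examples (the ACL names,
cookie name and values are made up, and I'm assuming the offset
expression is given in milliseconds):

  frontend www
    bind :80
    acl is_asset  path_end .css .js
    acl premium   req.cook(plan) -m str premium
    # Serve CSS/JS ahead of the HTML page when requests queue up, so the
    # browser isn't stuck on a cleared page waiting for its assets.
    http-request set-priority-offset int(-200) if is_asset
    # Give premium customers a 100ms head start under contention.
    http-request set-priority-offset int(-100) if premium
    default_backend app
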
>>
>> Some people want to use such features the other way around. Certain
>> sites have low-importance stuff like an avatar upload button, a profile
>> settings page, etc., on which people "waste their time" instead of making
>> use of the main site, while still consuming bandwidth and processing
>> power. Sometimes they simply want such auxiliary requests to get an
>> extra delay applied when there's competition to reach the servers. Just
>> by adding an extra 1-second offset to such requests, you'll slow them
>> down when the servers are highly loaded, and preserve the resources for
>> the other requests.
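For that auxiliary-requests case it could be as simple as something like
this inside the frontend (the paths are invented, and the offset is again
assumed to be in milliseconds):

    # Push avatar uploads and profile edits back by one second whenever
    # there is competition for the servers; with an empty queue they go
    # through immediately, as before.
    http-request set-priority-offset int(1000) if { path_beg /avatar /profile }
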
>>
>> It's really impressive when you run two load generators in parallel. You
>> will typically see, say, 100 ms response time for both of them by default.
>> Then you apply +100ms to the requests coming from one of them, and then
>> you'll see a distribution like 66/166 ms, with the load of the slow one
>> dropping as its response time increases, giving more resources to the
>> other one, allowing its traffic to be processed faster.
>>
>> I hope I helped clear your doubts.
>
> Man, yes! As always, a super good explanation ;-)
>
>> I have a few comments below:
>>
>>> listen|frontend test001
>>>   bind :8080
>>>   # some other options
>>>   http-request set-priority-class -10s if PAY_BUTTON
>>
>> The class just uses integers, not delays. Also it's an interesting case
>> you have here, as many sites prefer to *increase* the delay on the pay
>> button or even once there's anything in your cart. There's a reason :
>> visitors who don't find what they're looking for are likely to quit the
>> site, so you need to improve their experience while they're searching.
>> Once their shopping cart is filled with a few articles, they're not
>> willing to spend as much time looking for them again on another site,
>> so they're very likely to accept waiting (even if that scares them).
>> Thus by delaying their traffic you can improve others' experience. Yes
>> I know it's dirty, but I don't operate a shopping site :-)
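Tongue in cheek, that could look something like this (the cart cookie is
hypothetical):

    # Visitors with something in their cart can tolerate a bit more
    # queuing, freeing resources for people who are still searching.
    acl has_cart req.cook(cart_items) -m found
    http-request set-priority-class int(3) if has_cart
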
>
> ;-)
>
>>>   http-request set-priority-offset 5s if LOGO
>>
>> That's typically a good example, yes, as nobody cares about the logo
>> being loaded fast.
>
> Neither of us is a marketing person; those people would like to k... us
> for that setup ;-)
>
>>>   # as a sample expression can also be this, right?
>>>   http-request set-priority-class \
>>>     %[req.fhdr(User-Agent),51d.single(DeviceType,IsMobile,IsTablet)]
>>
>> No, it takes a constant expression. I'm not sure whether it would
>> really make sense to make it support sample expressions. I'm not
>> necessarily against it, it's just that I think it comes with extra cost
>> and complexity for very low value in the end. Very likely you can do
>> this using 2 or 3 rules only.
Actually it does take a sample expression, not a constant. This was done
so that you could calculate a priority based on multiple criteria in the
request.
However, if desired, we could make it behave like many of the sample
converters, which accept either a numeric value or a variable name.
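So, as a rough sketch, the class can already be computed from several
criteria, e.g. (the crude User-Agent substring checks and class values
below are only placeholders standing in for the 51Degrees converter from
your example):

    http-request set-var(req.class) int(0)
    http-request set-var(req.class) int(2)  if { req.fhdr(User-Agent) -m sub Mobile }
    http-request set-var(req.class) int(-1) if { req.fhdr(User-Agent) -m sub Tablet }
    http-request set-priority-class var(req.class)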

>
> Agree.
> In that case the doc should be changed, imho.
>
> ###
> +  The "set-priority-class" is used to set the queue priority class of the
> +  current request. The value must be a sample expression which converts to an
>                                          ^^^^^^^^^^^^^^^^^
> +  integer in the range -2047..2047.
> ###
>
> How about this?
>
> +  The "set-priority-class" is used to set the queue priority class of
> the
> +  current request. The value must be a integer number between
> -2047..2047.
>
> for offset also
>
> I will create a PR for this as soon as we agree to the wording.
>
>>>   # could this work ?
>>>   use_backend high_priority if priority-class > 5
>>
>> We don't have a sample fetch to retrieve the value, but it would be
>> trivial to implement as the value is present in the stream. Feel free
>> to take a look at how priority-class and priority-offset work for this.
>> Right now you can already achieve this by setting a variable anyway,
>> but I can see value in using the class to take a routing decision.
>
> Well, there are two, as far as I understand the code.
>
> ### src/queue.c
> +static struct sample_fetch_kw_list smp_kws = {ILH, {
> +       { "prio_class",  smp_fetch_priority_class,  0, NULL, SMP_T_SINT, SMP_USE_INTRN, },
> +       { "prio_offset", smp_fetch_priority_offset, 0, NULL, SMP_T_SINT, SMP_USE_INTRN, },
> +       { /* END */},
> ###
>
> I think they are used internally; what would be the impact of changing
> "SMP_USE_INTRN" to something which can be used for external lookups?
You're able to use these sample fetches fine. SMP_USE_INTRN doesn't mean
it's not for use by users. It means the information comes from internal
haproxy state, not from the request content.
Although it appears I forgot the documentation for these. Oops. I'll get
that addressed and submit a patch shortly.
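So the routing example further up should already be expressible, roughly
like this (untested; the path, class value and backend names are made up):

    http-request set-priority-class int(10) if { path_beg /batch }
    # http-request rules are evaluated before backend selection, so the
    # class is already set when use_backend runs.
    use_backend bulk if { prio_class ge 5 }
    default_backend main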

>
> Maybe in the next round.
> As you said, for now we can set a variable and route based on that.
>
>> Cheers,
>> Willy
>
> Regards
> aleks

To answer one of the earlier questions, I do plan on writing a blog
article, yes. The question is when. I'm considering backporting this
feature to 1.8 for my own use at work, and if I do, I might wait for that
so I have some real-world usage to comment on.

Willy's example use cases were spot on. But to add the use case which
triggered me to write this code: the intent is more for layer 7 DoS
mitigation. It's really hard to correctly identify bots vs. real users.
You can usually get it right, but mis-identification is very easy. So the
idea here is that we would add a score to incoming requests based on
things like whether the user has a cookie, whether they are a registered
user, the request rate, etc., similar to how email spam filters work.
Each of these things would increment or decrement the score, and requests
would then be queued based on the result. Then when we do have an L7
attack, we only give compute resources to the attackers when there are no
real users in the queue. Thus real users might see some slowdown, but it
should be minimal. And since we're not actually blocking an attacker, it
makes it much harder for them to figure out the criteria we're using to
identify them and get around it. Also, since we're not blocking, users
who end up mis-identified as bots aren't impacted during normal
operations, only when we're under attack. And even then the impact should
be minimal, since while a real user might have triggered a few of the
score rules, a bot would hopefully have triggered more.
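As a very rough sketch of that scoring idea (completely untested; the
cookie names, thresholds and table sizes are just placeholders):

  frontend www
    bind :80
    # Track the per-source request rate in a stick-table.
    stick-table type ip size 100k expire 10m store http_req_rate(10s)
    http-request track-sc0 src

    # Signals suggesting a real user (lower score) or a bot (higher score).
    acl has_session req.cook(SESSIONID) -m found
    acl logged_in   req.cook(logged_in) -m found
    acl high_rate   sc_http_req_rate(0) gt 100

    # Build the score, then use it as the queue priority class: real users
    # end up with a lower class and are dequeued first, so suspected bots
    # only get served when no real user is waiting.
    http-request set-var(req.score) int(0)
    http-request set-var(req.score) var(req.score),sub(2) if has_session
    http-request set-var(req.score) var(req.score),sub(3) if logged_in
    http-request set-var(req.score) var(req.score),add(5) if high_rate
    http-request set-priority-class var(req.score)

    default_backend app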


-Patrick
