On 2018/5/5 01:29, Willy Tarreau wrote: > On Fri, May 04, 2018 at 06:49:00PM -0400, Patrick Hemmer wrote: >> I'm not quite following the need for multiple queues. Why wouldn't you >> just have one sorted queue, where if multiple pending requests have the >> same priority, then they're FIFO. > That's what the time-ordered queue does. I proposed this so that you can > ensure that lower priorities are always processed first until there is no > more of the same level. Ah, I was considering the 2 solutions as an either-or choice. Not that you'd use both in the same configuration, or at least in the same backend.
>> The queue was implemented using the existing pendconns code. The only >> thing that was changed was to insert in sorted order. Obviously this is >> just a quick and dirty implementation. There are lots of priority queue >> implementations to choose from, and I don't know which one is best >> suited to this kind of task. We could keep the existing sorted >> linked-list, and just do some optimizations for insertion. Or we could >> switch to some sort of tree. The linked-list has the advantage in that I >> suspect that the vast majority of additions will append to the end, and >> pops are always off the front, so well suited to a linked list. A tree >> might become very unbalanced very fast, and require lots of maintenance. > No, trees are quite cheap, we use them absolutely everywhere now. Just > take a look at wake_expired_tasks() to get an idea. In fact every single > time we've been relying on lists doing linear lookups, we've got several > bug reports about haproxy eating all the CPU because some people were > using it in a way that wasn't initially considered as reasonable but > which made a lot of sense to them. This is why I liked the idea of a single implementation. It gave the flexibility to the user to do whatever they wanted (time or class based). Granted I was only thinking of doing one or the other, but the user could still do both, they would just have to manage dedicating a range to the date/class. Maybe someone would want to consider date before class. > In this case you'd be better of with the principle consisting in using > a larger integer with the priority at the top and the date at the bottom. > But that's where I said that it would require two lookups to deal with > wrapping date so I'm not sure it's any better than using multiple queues. > Well, at least it's less maintenance burden. The detailed principle is > the following : > > uint64_t priority : > > 63 32 31 0 > [ hard priority | date ] > > next = lookup_first(queue); > if (!next) > return; > > => next is the lowest value of both priority and date > > Then you perform a second lookup with the oldest past date for the same > priority level : > > key = (next->key & 0xffffffff00000000ULL) | (now_ms - TIMER_LOOK_BACK); > next2 = lookup_ge(queue, key); > if (next2 && !((next2->value ^ next->value) >> 32)) > next=next2 > > Then you dequeue next which is always the next one. > > That's where I thought that using two distinct levels was simpler, but > if we factor in the fact that it would limit the number of priority > classes and would require more maintenance code, it's probably not > worth the hassle to save the 4 extra lines we have above to respect > the wrapping date. > > Also I'm thinking that we can even use 32-bit by putting the frontier > between the date and the fixed priority (let's call it class) somewhere > else : > - 8 bit class + 24 bit offset => 256 classes and +/- 2.3 hours offsets > - 12 bit class + 20 bit offset => 4096 classes and +/- 8 minutes offsets > - 16 bit class + 16 bit offset => 64k classes and +/- 32 second offsets What's the benefit to using 32-bit over 64-bit, and is that benefit worth the cost of the added code complexity and processing time? If we used 64-bit, we could do a 16/48 bit split and have an absolute timestamp out to year 10889 at millisecond precision. > I'm pretty sure that the two first options are way enough to cover almost > everyone's needs. For me a 12-bit class should work. My intention is to implement something similar to email spam filtering, where you generate a score based on the rules that match, and some rules are worth more than others. 8 bits might be a bit difficult to work within. I also suspect the 16-bit option (32 second offset) is too small for date based. > Let's say for a moment that we'd use the second option > (12+20 bits), the queue insertion code becomes this : > > pendconn->node.key = (stream->prio_class << 20) + > (now_ms + stream->prio_offset) & 0xFFFFF; > if (srv) > eb32_insert(&srv->pendconns, &pendconn->node); > else > eb32_insert(&px->pendconns, &pendconn->node); > > > And the dequeuing code in pendconn_process_next_strm() which takes care > both of the backend's and server's queues while still respecting their > classes would become (modulo locking, which is not trivial there but > which has to be preserved regardless of the solution) : > > srv_node = eb32_first(&srv->pendconns); > prx_node = eb32_first(&prx->pendconns); > if (!srv_node) > return prx_node; > if (!prx_node) > return srv_node; > > if (srv_node->key >> 20 < prx_node->key >> 20) > return srv_node; > > if (srv_node->key >> 20 > prx_node->key >> 20) > return prx_node; > > /* classes are equal, pick oldest expiration first */ > key = (srv_node->key & 0xfff00000) + > (now_ms - (TIMER_LOOK_BACK >> 12)) & 0xFFFFF; > > srv_node2 = eb32_lookup_ge(&srv->pendconns, key); > if (srv_node2 && !((srv_node2->value ^ srv_node->value) >> 20)) > srv_node = srv_node2; > > prx_node2 = eb32_lookup_ge(&prx->pendconns, key); > if (prx_node2 && !((prx_node2->value ^ prx_node->value) >> 20)) > prx_node = prx_node2; > > /* take the first one in time order */ > if ((prx_node->key - srv_node->key) & 0xFFFFF < 0x80000) > return srv_node; > else > return prx_node; > > All of this looks fairly reasonable to me and could possibly remove a > bit of the current complexity in the current function which has to > dereference the streams to find the queuing dates. > > Then the documentation becomes pretty simple : > - all requests from the lowest numbered priority classes are always > processed first > - within a same priority class, requests are processed in order of > arrival, to which a positive or negative time offset an be applied > > We could then have this in http-request and tcp-request : > > http-request set-priority-class int(x) > http-request set-priority-offset int(x) > > > BTW you were right to remind me about using a sample expression instead > of a static integer in the argument. I've been used to use static values > everywhere for years but for such cases, being able to use expressions > is much more useful as we could extract the value from a header or cookie > for example. > > Willy The rest of this all seems fine. I've got no other thoughts to add. So the question is, which route do we go? 1. 32-bit int (12/20 split) 2. 64-bit int (32/32 split) 3. 64-bit int (16/48 split) with absolute timestamps 4. 64-bit int with a single `set-priority` action. Let the user use the a new`ms()` fetch for time based. The user does the bit shifting if they want to use both (using the `mul()` converter, or we can add a `shift()`). -Patrick