Does no one else have a use case for a maxconn that counts a connection as 
finished when the end of the response headers is received, rather than when 
the connection is closed?
This is still an issue for us and we haven’t found any load balancer that can 
handle it. It leaves our servers under-utilised.

Basically we want to set maxconn = 20 (or some large number) and 
maxconn_processing = 1, where maxconn_processing counts only connections 
that have not yet finished returning their response headers.
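To make the proposal concrete, here is a sketch of what a backend section could look like. The maxconn_processing keyword is hypothetical (it does not exist in haproxy today), and the server name and address are made up:

```
backend zope
    # maxconn caps the total number of concurrent connections,
    # including slow streaming responses that the backend has
    # already handed off to an async thread.
    # maxconn_processing (hypothetical keyword) would cap only the
    # connections that have not yet finished returning response
    # headers, i.e. those still occupying a synchronous worker.
    server zope1 127.0.0.1:8080 maxconn 20 maxconn_processing 1
```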



> On 19 Mar. 2014, at 11:29 am, Dylan Jay <[email protected]> wrote:
> 
> Hi,
> 
> I was wondering if you'd given any more thought to this feature?
> 
> To summarise:
> - we are using a backend service that has both synchronous and async threads.
> - it has its own internal queuing
> - as soon as haproxy gets the first byte of the body we know the backend 
> service is able to accept another connection, as it has handed off the 
> streaming to an asynchronous thread
> - the way haproxy works now we can't take advantage of this. The streamed 
> response back to the user could take a long time to complete, yet our 
> backend service is sitting unoccupied because haproxy believes it has 
> already reached its maxconn of 1.
> - setting maxconn higher doesn't solve the problem
> - we can't differentiate the URLs such that we have two queues with different 
> maxconn values
> 
> What we'd like is something like maxstreamingconn, which can be set higher 
> than maxconn.
> 
> Dylan Jay
> 
> 
> 
> On 19 Nov 2012, at 10:56 pm, Dylan Jay <[email protected]> wrote:
> 
>> On 20/11/2012, at 12:29 AM, Willy Tarreau <[email protected]> wrote:
>> 
>>> On Tue, Nov 20, 2012 at 12:08:12AM +1300, Dylan Jay wrote:
>>>> No, not all responses. Zope has an object database with blob support. It
>>>> only applies to images, documents, videos etc. stored in the database,
>>>> which is sort of like static files. Once it's decided to send the file,
>>>> the transaction can end and the actual sending of the file is handed off
>>>> to an async thread. It doesn't apply when an HTML template page is being
>>>> generated, since the transaction doesn't end till the last bit of the
>>>> page has been generated.
>>> 
>>> OK, that's becoming a bit tricky then.
>> 
>> I think I'm making it sound more complicated than it is. Basically, in
>> most web apps, once the status header has been sent most processing has
>> already happened, since otherwise the server couldn't indicate to the
>> browser that an error occurred. So the status header generally means
>> the server will be ready to receive another request very soon.
>> 
>>> 
>>>> The problem I'm trying to solve is delivering very large blobs, like long
>>>> videos, which could take a long time to stream and aren't amenable to
>>>> caching in something like Varnish. Currently, while that's happening, the
>>>> CPU is being under-utilised.
>>> 
>>> Do you know if there is something in the request that can tell you whether
>>> the request is for a large blob or a generated page? If so, I have a
>>> solution :-)
>> 
>> :)
>> Only the extension in the URL, or perhaps a Range header with some
>> video players. But both are hacks and won't apply in all cases, which
>> is why I'm looking to avoid them.
>> 
>>> 
>>>> I'm trying out setting maxconn to 2 even when dynamic requests will be
>>>> handled synchronously. At least some of the time a request will get
>>>> processed earlier, increasing CPU utilisation.
>>> 
>>> I agree and this is generally what people do when running with such low 
>>> limits.
>>> 
>>>> The risk is that a request could get stuck behind a slow request, and
>>>> since haproxy has already handed it off to the backend it can't
>>>> redistribute it (or it could, but then it would get done twice).
>>> 
>>> In general the risk is low, because if a request gets too slow and times out,
>>> there are big chances that the second request will experience the same fate.
>> 
>> Not really. Some requests are just slow transactions, like complex
>> saves or long generated HTML pages. If a request is waiting behind a
>> 2-second request when it could have been sent to a server that was just
>> serving video, then that's inefficient.
>> 
>> 
>>> 
>>>> But if there was a setting like max-active-requests=1, that would result
>>>> in better balancing. Or perhaps there could be a way to use ACLs with
>>>> response headers to raise the maxconn while serving a video?
>>> 
>>> The maxconn cannot be adjusted that way, it would be a bit dangerous. But
>>> maybe we could have a per-server active-request count and use this to
>>> offset the maxconn vs curr_conn computations when deciding whether to
>>> dequeue or not.
>>> 
>>> However I still think that playing with maxconn is a bit dangerous because
>>> I'm fairly sure that your server has a hard limit you don't want to cross.
>>> And that's the goal of the maxconn setting.
>>> 
>> 
>> But if there were both a maxconn and a max-processing (or max-active-request)
>> limit, I'd set maxconn to say 4 and maxprocessing to 1, since there is a
>> limit to the number of blobs a single thread can serve efficiently at once.
>> 
>> 
>>> Willy
> 
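For reference, the URL-based workaround discussed above (splitting traffic into two queues with different maxconn values) would look roughly like this. The backend/server names and addresses are made up, and as noted it only helps where blob requests can actually be matched, e.g. by extension, which isn't reliable in our case:

```
frontend www
    bind :80
    # Assumes blob URLs are distinguishable by extension, which as
    # discussed above does not hold for all of our requests.
    acl is_blob path_end .mp4 .avi .jpg .png .pdf
    use_backend blobs if is_blob
    default_backend pages

backend pages
    # generated pages hold the synchronous transaction open,
    # so keep the per-server limit at 1
    server zope1 127.0.0.1:8080 maxconn 1

backend blobs
    # blob streaming is handed off to async threads, so allow more
    server zope1 127.0.0.1:8080 maxconn 20
```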

