Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
On 05 Oct 2015, at 5:13 PM, Eric Covener wrote:

> It seems like extending that to handlers (and implicitly, request
> filters) is more the "can't be done" part.

Having had a chance to look at the handlers in more detail, I suspect the problem is that async support wasn't extended far enough. You can return SUSPENDED from a handler, but only on the main request; if you're in an internal redirect or a subrequest, SUSPENDED is treated as an error.

Request filters are already asynchronous, so they aren't a problem. A caller can only take advantage of the async nature of the input filters if the caller itself is asynchronous. It should be possible to use an async input filter from an async output filter simply by changing the sense of the connection and letting the core deal with it. This is next for me; I plan to fix any bugs that stop this being possible.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
On Sun, Oct 4, 2015 at 1:40 PM, Graham Leggett wrote:

> The next bit of this is the ability to safely suspend a filter.
>
> A suspended filter is a filter that is waiting for some external event, a
> callback of some kind, some timer that it might have set, and in the
> meantime doesn't want to be kicked. From the perspective of
> ap_filter_should_yield() the filter has data in it and a well-behaved
> upstream filter should wait before sending anything further.
>
> The idea behind suspending filters over and above connections is that if two
> separate filters want to suspend a connection, what stops them from stomping
> on one another?

Why suspend a filter rather than the calling handler (on EAGAIN/EWOULDBLOCK)?

> I am thinking of the sharp end of mod_proxy_http (and friends) becoming an
> async filter that suspends or not based on whether data is available on the
> other connection. In the process, mod_proxy becomes asynchronous.

It seems that suspending the handlers would avoid rewriting them as filters. Or is your plan to go for something like (mod_)serf?

Regards,
Yann.
Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
I was wondering: how much of what we want to do would be easier if we decided to make serf and/or mod_lua and/or libmill dependencies for 2.6/3.0? We could leverage Lua (or libmill) as libraries implementing some level of coroutines, and serf for a more organic bucket implementation.

Of course, this could also result in the issue we currently have w/ APR, in which APR itself lags behind what httpd needs and we end up adding stuff to httpd anyway...

H...
Re: [Patch] Async write completion for the full connection filter stack
On 05 Oct 2015, at 1:13 PM, Yann Ylavic wrote:

> Why suspend a filter rather than the calling handler (on EAGAIN/EWOULDBLOCK)?

Primarily because there are many filters, but only one handler. What I have in mind is combining an async bandwidth-limitation filter with an async handler. Right now you can't do that, and you should be able to; that's the point of the filter stack.

> It seems that suspending the handlers would avoid rewriting them as filters.
> Or is your plan to go for something like (mod_)serf?

Suspending the handler is an option, but it's less than ideal if the handler suspends the connection while there is data buffered in the filter stack. In this case data writes will be delayed until the source of the data pushes more data through after the connection is woken up again. Ideally outbound data should be pulled out of the stack, not pushed into it.

Regards,
Graham
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
On 05 Oct 2015, at 3:41 PM, Jim Jagielski wrote:

> Of course, this could also result in the issue we currently
> have w/ APR in which APR itself lags behind what httpd itself
> needs and we end up add stuff to httpd anyway...
>
> H...

This was my approach: I wanted the moon on a stick.

What exactly constituted a moon on a stick? I wanted all of the following:

- I wanted to make httpd asynchronous, and decisively allow httpd to join the list of servers that can support hundreds of thousands of concurrent requests across the board, not just as a special case for specific content (files).

- I didn't want to wait. This means I wanted whatever change I made to be backportable to httpd v2.4. I don't care to wait 5 years for httpd v2.6 to finally appear as the default version in stable OS distros.

- I wanted to work with what was already there. That meant making the existing ap_filter API and the existing buckets work with backwards-compatible changes to httpd and no changes to apr/apr-util.

- I wanted a mechanism that was not brittle. This is probably the hardest part of async programming: how do you prevent something spinning or hanging? When the new mechanism is used wrongly, the thread just becomes synchronous until the data is written and eventually works anyway, which is existing behaviour and not the end of the world.

I am aware that some people have said that it couldn't be done in httpd's design. I figured I would test that hypothesis by trying anyway; the worst that could happen was that they were right, but the server could at least be improved as best it could. I believe however that there is a good chance the moon on a stick has been achieved.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
On Mon, Oct 5, 2015 at 10:37 AM, Graham Leggett wrote:
> On 05 Oct 2015, at 4:25 PM, Eric Covener wrote:
>
>> If content/request filters will now sometimes run on a different
>> thread, is this a special kind of major MMN bump?
>
> Content/request filters won't run on a different thread, no.

Sorry, dunno how I got turned around on that -- clear from the first email.
Re: [Patch] Async write completion for the full connection filter stack
On Sun, Sep 7, 2014 at 9:31 PM, Graham Leggett wrote:

> The test suite passes with this patch, I need some more eyeballs to verify I
> didn't miss anything.

If content/request filters will now sometimes run on a different thread, is this a special kind of major MMN bump?
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
On Mon, Oct 5, 2015 at 10:57 AM, Graham Leggett wrote:
>
> What exactly constituted a moon on a stick? I wanted all of the following:
>
> - I wanted to make httpd asynchronous, and decisively allow httpd to join the
> list of servers that can support hundreds of thousands of concurrent requests
> across the board, not just as a special case for specific content (files).
>
> - I didn't want to wait. This means I wanted whatever change I made to be
> backportable to httpd v2.4. I don't care to wait 5 years for httpd v2.6 to
> finally appear as the default version in stable OS distros.
>
> - I wanted to work with what was already there. That meant making the
> existing ap_filter API and the existing buckets work with backwards-compatible
> changes to httpd and no changes to apr/apr-util.
>
> - I wanted a mechanism that was not brittle. This is probably the hardest
> part of async programming: how do you prevent something spinning or hanging?
> When the new mechanism is used wrongly, the thread just becomes synchronous
> until the data is written and eventually works anyway, which is existing
> behaviour and not the end of the world.
>
> I am aware that some people have said that it couldn't be done in httpd's
> design. I figured I would test that hypothesis by trying anyway; the worst
> that could happen was that they were right, but the server could at least be
> improved as best it could. I believe however that there is a good chance the
> moon on a stick has been achieved.

Is it that common for so much of the server to be tied up in write completion, or just a very big problem for some systems? Most of my experience is the opposite problem -- slow (java!) backends rather than clients not keeping up with the firehose.

It seems like extending that to handlers (and implicitly, request filters) is more the "can't be done" part.
Re: [Patch] Async write completion for the full connection filter stack
On 05 Oct 2015, at 4:25 PM, Eric Covener wrote:

> If content/request filters will now sometimes run on a different
> thread, is this a special kind of major MMN bump?

Content/request filters won't run on a different thread, no. The only filters that will do so are filters that have been updated to use the new async API, and right now that's mod_ssl and the core network filter (which always had this capability).

If it were to become a problem, the whole patch can be disabled by forcing ap_filter_should_yield() to return false. I don't believe we need a switch at this stage, but it is there should it be needed.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
> On Oct 4, 2015, at 9:59 AM, Tim Bannister wrote:
>
> On 4 Oct 2015, at 12:40, Graham Leggett wrote:
>>
>> The next bit of this is the ability to safely suspend a filter.
> …
>> I am thinking of the sharp end of mod_proxy_http (and friends) becoming an
>> async filter that suspends or not based on whether data is available on the
>> other connection. In the process, mod_proxy becomes asynchronous.
>
> Also super cool mojo.

++1!
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
On 05 Oct 2015, at 5:13 PM, Eric Covener wrote:

> Is it that common for so much of the server to be tied up in write
> completion, or just a very big problem for some systems? Most of my
> experience is the opposite problem -- slow (java!) backends rather
> than clients not keeping up with the firehose.

It is not common enough for the server to be in write completion. Until now mod_ssl couldn't enter write completion, because it had no way to yield without writing the entire response successfully to the network. As soon as you added SSL, the server became synchronous.

The key problem to solve was the "mod_cache problem": how does a backend write into mod_cache at full speed and then go away as soon as possible, while a slow client eventually reads the response? We can now (soon, when mod_cache is taught to be async like mod_ssl is) do that, and that is very cool.

The slow clients that become a killer are those on a dodgy mobile phone connection that end up using a slot for far longer than they should. If that slot is tied through via the proxy to a slow Java app, that sucks even more.

> It seems like
> extending that to handlers (and implicitly, request filters) is more
> the "can't be done" part.

I've been working from right to left, from the network filter backwards to the handlers, and the solution I've explored is "if it can't be done in a handler, make it possible to do in a filter instead". The idea is that the handler sets up the filter stack, and then sends down some data to kick things off (or no data, like mod_cache does). At that point the handler exits and we're now in write completion. The filter now (soon) has the power to suspend the connection if we're waiting for someone else (like a connection from the proxy, a timer callback, whatever), and also has the power to switch the "sense" of a connection from a write to a read.

So the output filter might switch the sense to read and then leave, and then when the read event is triggered and we're kicked, try to do a read on the input filter stack. I suspect that a handler can do the same thing right now: switch the sense to read and then suspend itself. On the next read, the handler will be woken up, and off we go, ready to read once. Rinse and repeat until we're done reading.

Regards,
Graham
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
Thx (somewhat behind) :)

> On Oct 5, 2015, at 12:24 PM, Graham Leggett wrote:
>
> On 05 Oct 2015, at 6:10 PM, Jim Jagielski wrote:
>
>> Is there any reason to not fold this into trunk and start playing
>> around?
>
> I committed it yesterday:
>
> http://svn.apache.org/r1706669
> http://svn.apache.org/r1706670
>
> Regards,
> Graham
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
On 05 Oct 2015, at 6:10 PM, Jim Jagielski wrote:

> Is there any reason to not fold this into trunk and start playing
> around?

I committed it yesterday:

http://svn.apache.org/r1706669
http://svn.apache.org/r1706670

Regards,
Graham
Re: Serf and mod_lua and libmill (Was: Re: [Patch] Async write completion for the full connection filter stack)
Achieved!! Once again: this is some super cool mojo!!!

> On Oct 5, 2015, at 10:57 AM, Graham Leggett wrote:
>
> On 05 Oct 2015, at 3:41 PM, Jim Jagielski wrote:
>
>> Of course, this could also result in the issue we currently
>> have w/ APR in which APR itself lags behind what httpd itself
>> needs and we end up add stuff to httpd anyway...
>>
>> H...
>
> This was my approach: I wanted the moon on a stick.
>
> What exactly constituted a moon on a stick? I wanted all of the following:
>
> - I wanted to make httpd asynchronous, and decisively allow httpd to join the
> list of servers that can support hundreds of thousands of concurrent requests
> across the board, not just as a special case for specific content (files).
>
> - I didn't want to wait. This means I wanted whatever change I made to be
> backportable to httpd v2.4. I don't care to wait 5 years for httpd v2.6 to
> finally appear as the default version in stable OS distros.
>
> - I wanted to work with what was already there. That meant making the
> existing ap_filter API and the existing buckets work with backwards-compatible
> changes to httpd and no changes to apr/apr-util.
>
> - I wanted a mechanism that was not brittle. This is probably the hardest
> part of async programming: how do you prevent something spinning or hanging?
> When the new mechanism is used wrongly, the thread just becomes synchronous
> until the data is written and eventually works anyway, which is existing
> behaviour and not the end of the world.
>
> I am aware that some people have said that it couldn't be done in httpd's
> design. I figured I would test that hypothesis by trying anyway; the worst
> that could happen was that they were right, but the server could at least be
> improved as best it could. I believe however that there is a good chance the
> moon on a stick has been achieved.
>
> Regards,
> Graham
Re: [Patch] Async write completion for the full connection filter stack
On 28 Sep 2015, at 2:11 PM, Jim Jagielski wrote:

> This is super cool magic mojo.

The next bit of this is the ability to safely suspend a filter.

A suspended filter is a filter that is waiting for some external event: a callback of some kind, or some timer that it might have set, and in the meantime it doesn't want to be kicked. From the perspective of ap_filter_should_yield() the filter has data in it, and a well-behaved upstream filter should wait before sending anything further.

The idea behind suspending filters, over and above connections, is that if two separate filters want to suspend a connection, what stops them from stomping on one another?

Patch to follow.

I am thinking of the sharp end of mod_proxy_http (and friends) becoming an async filter that suspends or not based on whether data is available on the other connection. In the process, mod_proxy becomes asynchronous.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
On 4 Oct 2015, at 12:40, Graham Leggett wrote:
>
> The next bit of this is the ability to safely suspend a filter.
…
> I am thinking of the sharp end of mod_proxy_http (and friends) becoming an
> async filter that suspends or not based on whether data is available on the
> other connection. In the process, mod_proxy becomes asynchronous.

Also super cool mojo.

-- 
Tim Bannister – is...@c8h10n4o2.org.uk
Re: [Patch] Async write completion for the full connection filter stack
On 28 Sep 2015, at 12:18 PM, Graham Leggett wrote:

> Well, cracked the async problem for all file buckets and non-morphing
> buckets; morphing buckets will currently still be flushed.
>
> The solution to that seems to be to try to keep a morphing bucket from
> reaching the connection filters. A morphing bucket can safely be set aside
> in a request filter without any problems with pools. If a filter existed at
> the end of the request filters that detected morphing buckets and read them
> before sending them downstream, that filter could safely set aside the rest
> of the morphing bucket without being forced to read the whole thing in (and
> thus block). If the filter received a flush, the whole thing gets sent and
> we block, but that's what flush means, so that's fine.
>
> Patch to follow.

Here it is.

Regards,
Graham

-- 
httpd-async-fullstack-ssl7.patch
Description: Binary data
Re: [Patch] Async write completion for the full connection filter stack
This is super cool magic mojo.

> On Sep 27, 2015, at 2:41 PM, Graham Leggett wrote:
>
> Hi all,
>
> I think I have cracked the async problem for both request and connection
> output filters with this patch. It provides three new functions:
>
> - ap_filter_reinstate_brigade() - Used at the start of a filter; brigades
> that were set aside earlier are reinstated for sending.
>
> - ap_filter_should_yield() - Returns true if downstream filters have set
> aside data. A filter would typically respond by setting aside the data it is
> working with and returning, in the expectation of being called again.
>
> - ap_filter_setaside_brigade() - Used at the end of a filter; any brigades
> that were not processed can be set aside to continue the job when called
> later.
>
> The magic happens in the MPM itself. The first time
> ap_filter_setaside_brigade() is called, the filter is added to a hashtable
> and a cleanup is registered to have the filter removed when the request
> and/or connection is cleared. The MPM iterates through this hashtable, and
> sends empty brigades to any filter with setaside data.
>
> Key to this technique is that existing filters remain unaffected - in the
> absence of any changes to a filter, the whole brigade will be processed and
> sent downstream, and existing behaviour is maintained. Same with FLUSH
> buckets - as expected, flush behaviour remains unchanged.
>
> If however the filters in the chain are able to set aside buckets, they can
> defer themselves to the write completion phase, which in turn can take full
> advantage of the event MPM. These filters will be expected to handle an
> empty brigade to "kick" them back into life and continue the writing of
> their setaside data. As soon as their setaside brigade becomes empty the
> kicks stop. All filters with setaside data get kicked exactly once, so none
> of the filters should get starved. Filters that are removed from the stack
> get their setaside emptied, and so become ineligible for kicks. When the
> pool cleanup gets triggered, the filter is permanently removed from the
> connection and disappears.
>
> Over and above the network filter, the first filter to be modified is
> mod_ssl, and this will allow files served over SSL to take advantage of
> write completion. Another filter that will benefit from using the above
> calls is mod_deflate.
>
> I have an additional goal of adding an ap_filter_suspend() method that will
> allow us to suspend a filter for a given period of time or until some
> callback is triggered, but that will be a separate patch.
>
> Regards,
> Graham
Re: [Patch] Async write completion for the full connection filter stack
On 27 Sep 2015, at 8:41 PM, Graham Leggett wrote:

> I think I have cracked the async problem for both request and connection
> output filters with this patch.

Well, cracked the async problem for all file buckets and non-morphing buckets; morphing buckets will currently still be flushed.

The solution to that seems to be to try to keep a morphing bucket from reaching the connection filters. A morphing bucket can safely be set aside in a request filter without any problems with pools. If a filter existed at the end of the request filters that detected morphing buckets and read them before sending them downstream, that filter could safely set aside the rest of the morphing bucket without being forced to read the whole thing in (and thus block). If the filter received a flush, the whole thing gets sent and we block, but that's what flush means, so that's fine.

Patch to follow.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
Hi all,

I think I have cracked the async problem for both request and connection output filters with this patch. It provides three new functions:

- ap_filter_reinstate_brigade() - Used at the start of a filter; brigades that were set aside earlier are reinstated for sending.

- ap_filter_should_yield() - Returns true if downstream filters have set aside data. A filter would typically respond by setting aside the data it is working with and returning, in the expectation of being called again.

- ap_filter_setaside_brigade() - Used at the end of a filter; any brigades that were not processed can be set aside to continue the job when called later.

The magic happens in the MPM itself. The first time ap_filter_setaside_brigade() is called, the filter is added to a hashtable and a cleanup is registered to have the filter removed when the request and/or connection is cleared. The MPM iterates through this hashtable, and sends empty brigades to any filter with setaside data.

Key to this technique is that existing filters remain unaffected - in the absence of any changes to a filter, the whole brigade will be processed and sent downstream, and existing behaviour is maintained. Same with FLUSH buckets - as expected, flush behaviour remains unchanged.

If however the filters in the chain are able to set aside buckets, they can defer themselves to the write completion phase, which in turn can take full advantage of the event MPM. These filters will be expected to handle an empty brigade to "kick" them back into life and continue the writing of their setaside data. As soon as their setaside brigade becomes empty the kicks stop. All filters with setaside data get kicked exactly once, so none of the filters should get starved. Filters that are removed from the stack get their setaside emptied, and so become ineligible for kicks. When the pool cleanup gets triggered, the filter is permanently removed from the connection and disappears.

Over and above the network filter, the first filter to be modified is mod_ssl, and this will allow files served over SSL to take advantage of write completion. Another filter that will benefit from using the above calls is mod_deflate.

I have an additional goal of adding an ap_filter_suspend() method that will allow us to suspend a filter for a given period of time or until some callback is triggered, but that will be a separate patch.

Regards,
Graham

-- 
httpd-async-fullstack-ssl6.patch
Description: Binary data
RE: [Patch] Async write completion for the full connection filter stack
-----Original Message-----
From: Yann Ylavic [mailto:ylavic@gmail.com]
Sent: Sonntag, 14. September 2014 22:19
To: httpd
Subject: Re: [Patch] Async write completion for the full connection filter stack

> On Sat, Sep 13, 2014 at 9:23 PM, Ruediger Pluem rpl...@apache.org wrote:
>> How about to require that the caller of ap_filter_setaside_brigade just
>> hands over a non-NULL bucket_brigade as buffered_bb (so changing
>> apr_bucket_brigade ** to apr_bucket_brigade *) and that it should handle
>> *deferred_write_pool as an opaque structure and that it should not
>> allocate from it?
>
> Can that be enforced by the API as I proposed?

The opaque structure? Likely.

Regards
Rüdiger
Re: [Patch] Async write completion for the full connection filter stack
On 13 Sep 2014, at 9:23 PM, Ruediger Pluem rpl...@apache.org wrote:

> How about to require that the caller of ap_filter_setaside_brigade just
> hands over a non-NULL bucket_brigade as buffered_bb (so changing
> apr_bucket_brigade ** to apr_bucket_brigade *) and that it should handle
> *deferred_write_pool as an opaque structure and that it should not
> allocate from it?
>
> Or we request the caller to provide a non-NULL deferred_write_pool as well
> (so changing apr_pool_t ** to apr_pool_t *) and warn the caller that the
> pool might be cleared during the call of ap_filter_setaside_brigade. Hence
> solving all lifetime issues would be the responsibility of the caller.

There is an issue I hadn't considered - the effect of calling ap_remove_output_filter(). If we remove the filter with buckets having been set aside, the c->data_in_output_filters counter will remain unchanged and off by one, leading to a spin (it will never count back down to zero). This is a nasty bug to have to chase down.

In the attached patch I have investigated adding both buffered_bb and the deferred_write_pool to ap_filter_t, and having ap_remove_output_filter() "do the right thing" when called. Both of these can be optionally set by the caller, but are otherwise not handled by the caller; the filter API worries about cleaning up, and about ensuring that c->data_in_output_filters is always accurate.

I have also introduced an optimisation where if this is a request filter, we use r->pool to set aside buckets, while if we're a connection filter we use the deferred write pool.

Looks like a similar thing needs to be done with input filters.

Regards,
Graham

-- 
httpd-async-fullstack-ssl4.patch
Description: Binary data
Re: [Patch] Async write completion for the full connection filter stack
On 15 Sep 2014, at 2:13 PM, Graham Leggett minf...@sharp.fm wrote:

> There is an issue I hadn't considered - the effect of calling
> ap_remove_output_filter(). If we remove the filter with buckets having been
> set aside, the c->data_in_output_filters counter will remain unchanged and
> off by one, leading to a spin (it will never count back down to zero). This
> is a nasty bug to have to chase down.
>
> In the attached patch I have investigated adding both buffered_bb and the
> deferred_write_pool to ap_filter_t, and having ap_remove_output_filter()
> "do the right thing" when called. Both of these can be optionally set by
> the caller, but are otherwise not handled by the caller; the filter API
> worries about cleaning up, and about ensuring that
> c->data_in_output_filters is always accurate.
>
> I have also introduced an optimisation where if this is a request filter,
> we use r->pool to set aside buckets, while if we're a connection filter we
> use the deferred write pool.
>
> Looks like a similar thing needs to be done with input filters.

Another refinement - deferred_write_pool is now just deferred_pool, and we only manipulate data_in_output_filters when removing an output filter.

Regards,
Graham

-- 
httpd-async-fullstack-ssl5.patch
Description: Binary data
Re: [Patch] Async write completion for the full connection filter stack
On Sat, Sep 13, 2014 at 7:03 PM, Graham Leggett minf...@sharp.fm wrote:
> On 12 Sep 2014, at 4:57 PM, Yann Ylavic ylavic@gmail.com wrote:
>> Makes sense, however ap_core_output_filter() and ssl_io_filter_output()
>> *know* that their buffered_bb does not contain such a bucket (it is
>> exclusively filled by ap_filter_setaside_brigade()), so the original code
>> (still in svn) is more optimal since it does not walk through the same
>> brigade multiple times (doing an immediate nonblocking write with what it
>> has and returning).
>
> I suppose if we clearly document the behaviour it should be fine. We
> already force the return of should_write if an empty brigade was passed in
> (as in, "no more coming for now, just write it").

If an empty brigade is given but buffered_bb (controlled, and hence possibly mangled, by the user) contains a flush bucket, we will end up writing up to this bucket in blocking mode and the remaining (if any) in nonblocking mode, while the original code does it all nonblocking (because it knows its buffered_bb is OK).

>> To avoid that and the buffered_bb lifetime issue discussed with Rüdiger,
>> I'm thinking of an opaque type (a struct containing the buffered_bb and
>> the deferred_pool) that could be passed to both
>> ap_filter_setaside_brigade() and ap_filter_reinstate_brigade(). That way
>> no user could modify the content of the brigade or play with the pool
>> (opaque), so ap_filter_setaside_brigade() could create/clear the
>> deferred_pool as it wishes, and ap_filter_reinstate_brigade() could be
>> sure buffered_bb contains only suitable buckets.
>
> Are we not just trying to solve a dangling pointer to a brigade by
> replacing it with a dangling pointer to apr_filter_aside_t?

I don't think so; the only way to free an apr_filter_aside_t (outside util_filter.c) is to clear c->pool, which no sane filter will ever do (no need to document that, it's httpd's job). Either the apr_filter_aside_t* is NULL, or its fields are solely controlled by ap_filter_setaside_brigade() / ap_filter_reinstate_brigade(). Do the right thing in these functions and everything is fine.

Regards,
Yann.
Re: [Patch] Async write completion for the full connection filter stack
On Sat, Sep 13, 2014 at 9:23 PM, Ruediger Pluem rpl...@apache.org wrote:

> How about to require that the caller of ap_filter_setaside_brigade just
> hands over a non-NULL bucket_brigade as buffered_bb (so changing
> apr_bucket_brigade ** to apr_bucket_brigade *) and that it should handle
> *deferred_write_pool as an opaque structure and that it should not
> allocate from it?

Can that be enforced by the API as I proposed?

> Or we request the caller to provide a non-NULL deferred_write_pool as well
> (so changing apr_pool_t ** to apr_pool_t *) and warn the caller that the
> pool might be cleared during the call of ap_filter_setaside_brigade. Hence
> solving all lifetime issues would be the responsibility of the caller.

IMHO that would complicate the caller's job/code, probably doing what's in ap_filter_setaside_brigade() now (use a subpool to avoid the leak, manage the cleanup...). The function would become a simple ap_save_brigade() call (plus data_in_output_filters++), no real gain...

Regards,
Yann.
Re: [Patch] Async write completion for the full connection filter stack
On 12 Sep 2014, at 7:56 AM, Ruediger Pluem rpl...@apache.org wrote:
>> Agreed - as rpluem suggested a pool cleanup should be registered for
>> this, I've added it.
>
> I know that I proposed this, but rethinking it I see one risk with the
> cleanup. Take the following code, with long_living_apr_bucket_brigade_ptr
> being NULL in the beginning:
>
>     apr_bucket_brigade *stack_variable;
>
>     stack_variable = long_living_apr_bucket_brigade_ptr;
>     ap_save_brigade(f, &stack_variable, &bb, *deferred_write_pool);
>     long_living_apr_bucket_brigade_ptr = stack_variable;
>
> I guess in this case this could cause unpredictable behaviour, as the
> stack of the calling function is likely to be repurposed when the pool
> dies. Do we see that as something that we need to consider, or do we see
> this as a bug on the caller side? If we see it as a bug on the caller side
> we should probably document our expectations.

Hmmm… It does somewhat limit the caller if we force them to put the pointer somewhere not on the stack. Maybe we should remove the cleanup and document the expectation that the caller must clean it up. In most cases the caller is allocating the space from the parent pool anyway, so it all gets cleaned up regardless.

Regards,
Graham
Re: [Patch] Async write completion for the full connection filter stack
On 09/13/2014 07:05 PM, Graham Leggett wrote:
> On 12 Sep 2014, at 7:56 AM, Ruediger Pluem rpl...@apache.org wrote:
>>> Agreed - as rpluem suggested a pool cleanup should be registered for
>>> this, I've added it.
>>
>> I know that I proposed this, but rethinking it I see one risk with the
>> cleanup. Take the following code, with long_living_apr_bucket_brigade_ptr
>> being NULL in the beginning:
>>
>>     apr_bucket_brigade *stack_variable;
>>
>>     stack_variable = long_living_apr_bucket_brigade_ptr;
>>     ap_save_brigade(f, &stack_variable, &bb, *deferred_write_pool);
>>     long_living_apr_bucket_brigade_ptr = stack_variable;
>>
>> I guess in this case this could cause unpredictable behaviour, as the
>> stack of the calling function is likely to be repurposed when the pool
>> dies. Do we see that as something that we need to consider, or do we see
>> this as a bug on the caller side? If we see it as a bug on the caller
>> side we should probably document our expectations.
>
> Hmmm… It does somewhat limit the caller if we force them to put the
> pointer somewhere not on the stack. Maybe we should remove the cleanup
> and document the expectation that the caller must clean it up. In most
> cases the caller is allocating the space from the parent pool anyway, so
> it all gets cleaned up regardless.

How about to require that the caller of ap_filter_setaside_brigade just hands over a non-NULL bucket_brigade as buffered_bb (so changing apr_bucket_brigade ** to apr_bucket_brigade *) and that it should handle *deferred_write_pool as an opaque structure and that it should not allocate from it?

Or we request the caller to provide a non-NULL deferred_write_pool as well (so changing apr_pool_t ** to apr_pool_t *) and warn the caller that the pool might be cleared during the call of ap_filter_setaside_brigade. Hence solving all lifetime issues would be the responsibility of the caller.

Regards
Rüdiger
Re: [Patch] Async write completion for the full connection filter stack
On Thu, Sep 11, 2014 at 5:20 PM, Graham Leggett minf...@sharp.fm wrote:
> On 11 Sep 2014, at 2:12 PM, Yann Ylavic ylavic@gmail.com wrote:
>
> I would still do the flush check anyway - ap_filter_reinstate_brigade() has no idea what happened outside this function; we may have flushed buckets, or we may not have and set aside those flush buckets for whatever reason, and we want the behaviour to remain predictable.

Makes sense. However, ap_core_output_filter() and ssl_io_filter_output() *know* that their buffered_bb does not contain such a bucket (it is filled exclusively by ap_filter_setaside_brigade()), so the original code (still in svn) is more optimal since it does not walk through the same brigade multiple times (doing an immediate nonblocking write with what it has and returning).

To avoid that, and the buffered_bb lifetime issue discussed with Rüdiger, I'm thinking of an opaque type (a struct containing the buffered_bb and the deferred_pool) that could be passed to both ap_filter_setaside_brigade() and ap_filter_reinstate_brigade(). That way no user could modify the content of the brigade or play with the pool (opaque), so ap_filter_setaside_brigade() could create/clear the deferred_pool as it wishes, and ap_filter_reinstate_brigade() could be sure buffered_bb contains only suitable buckets.
Something like:

    # util_filter.h

    typedef struct ap_filter_aside ap_filter_aside_t;

    AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
                                                        ap_filter_aside_t **aside,
                                                        apr_bucket_brigade *bb);

    AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
                                                ap_filter_aside_t *aside,
                                                apr_bucket_brigade *bb,
                                                apr_bucket **flush_upto);

    # util_filter.c

    struct ap_filter_aside {
        apr_bucket_brigade *bb;
        apr_pool_t *pool;
    };

    AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
                                                        ap_filter_aside_t **aside,
                                                        apr_pool_t **deferred_write_pool,
                                                        apr_bucket_brigade *bb)
    {
        if (!APR_BRIGADE_EMPTY(bb)) {
            if (!*aside) {
                *aside = apr_palloc(f->c->pool, sizeof **aside);
                apr_pool_create(&(*aside)->pool, f->c->pool);
                apr_pool_tag((*aside)->pool, "deferred_write");
                (*aside)->bb = NULL;
            }
            if (!(*aside)->bb || APR_BRIGADE_EMPTY((*aside)->bb)) {
                f->c->data_in_output_filters++;
            }
            if (bb != (*aside)->bb) {
                return ap_save_brigade(f, &(*aside)->bb, &bb, (*aside)->pool);
            }
        }
        else if ((*aside)->pool) {
            /*
             * There are no more requests in the pipeline. We can just clear the
             * pool.
             */
            (*aside)->bb = NULL;
            apr_pool_clear((*aside)->pool);
        }
        return APR_SUCCESS;
    }

    AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
                                                ap_filter_aside_t *aside,
                                                apr_bucket_brigade *bb,
                                                apr_bucket **flush_upto)
    {
        apr_bucket *bucket, *next;
        apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
        int eor_buckets_in_brigade, morphing_bucket_in_brigade;
        int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
        int empty = APR_BRIGADE_EMPTY(bb);

        if ((*aside)->bb && !APR_BRIGADE_EMPTY((*aside)->bb)) {
            APR_BRIGADE_PREPEND(bb, (*aside)->bb);
            f->c->data_in_output_filters--;
            if (empty) {
                return 1;
            }
        }
        ...
    }

core_filters and mod_ssl would use an ap_filter_aside_t* instead of buffered_bb and deferred_write_pool.

WDYT?
Re: [Patch] Async write completion for the full connection filter stack
This was just a POC, with coding mistakes...

On Fri, Sep 12, 2014 at 4:57 PM, Yann Ylavic ylavic@gmail.com wrote:
> [...]
> AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
>                                                     ap_filter_aside_t **aside,
>                                                     apr_pool_t **deferred_write_pool,

Forgot to remove this arg.

>                                                     apr_bucket_brigade *bb)
> [...]
> AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
>                                             ap_filter_aside_t *aside,
>                                             apr_bucket_brigade *bb,
>                                             apr_bucket **flush_upto)
> {
> [...]
>     if ((*aside)->bb && !APR_BRIGADE_EMPTY((*aside)->bb)) {
>         APR_BRIGADE_PREPEND(bb, (*aside)->bb);

Of course here, read aside-> instead of (*aside)->.

>         f->c->data_in_output_filters--;
>         if (empty) {
>             return 1;
>         }
>     }
>     ...
> }

Regards,
Yann.
Re: [Patch] Async write completion for the full connection filter stack
On Fri, Sep 12, 2014 at 4:57 PM, Yann Ylavic ylavic@gmail.com wrote:
> AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
>                                             ap_filter_aside_t *aside,
>                                             apr_bucket_brigade *bb,
>                                             apr_bucket **flush_upto)
> {
>     apr_bucket *bucket, *next;
>     apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
>     int eor_buckets_in_brigade, morphing_bucket_in_brigade;
>     int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
>     int empty = APR_BRIGADE_EMPTY(bb);
>
>     if ((*aside)->bb && !APR_BRIGADE_EMPTY((*aside)->bb)) {
>         APR_BRIGADE_PREPEND(bb, (*aside)->bb);
>         f->c->data_in_output_filters--;
>         if (empty) {
>             return 1;
>         }
>     }

The (empty) check could even be done outside the block, in any case.

>     ...
> }
Re: [Patch] Async write completion for the full connection filter stack
On Fri, Sep 12, 2014 at 4:57 PM, Yann Ylavic ylavic@gmail.com wrote:
> AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
>                                                     ap_filter_aside_t **aside,
>                                                     apr_bucket_brigade *bb)
> {
>     if (!APR_BRIGADE_EMPTY(bb)) {
>         if (!*aside) {
>             *aside = apr_palloc(f->c->pool, sizeof **aside);
>             apr_pool_create(&(*aside)->pool, f->c->pool);
>             apr_pool_tag((*aside)->pool, "deferred_write");
>             (*aside)->bb = NULL;

To avoid a (small) possible leak here, maybe replace this by:

    apr_pool_t *p;
    apr_pool_create(&p, f->c->pool);
    apr_pool_tag(p, "deferred_write");
    *aside = apr_palloc(p, sizeof **aside);
    (*aside)->bb = NULL;
    (*aside)->pool = p;

>         }
>         if (!(*aside)->bb || APR_BRIGADE_EMPTY((*aside)->bb)) {
>             f->c->data_in_output_filters++;
>         }
>         if (bb != (*aside)->bb) {
>             return ap_save_brigade(f, &(*aside)->bb, &bb, (*aside)->pool);
>         }
>     }
>     else if ((*aside)->pool) {
>         /*
>          * There are no more requests in the pipeline. We can just clear the
>          * pool.
>          */
>         (*aside)->bb = NULL;
>         apr_pool_clear((*aside)->pool);

And this by:

    apr_pool_t *p = (*aside)->pool;
    /*
     * There are no more requests in the pipeline. We can just clear the
     * pool.
     */
    apr_pool_clear(p);
    *aside = apr_palloc(p, sizeof **aside);
    (*aside)->bb = NULL;
    (*aside)->pool = p;

>     }
>     return APR_SUCCESS;
> }

That wouldn't allow the use of *aside in an f->c->pool cleanup (post), but this seems unlikely.
Re: [Patch] Async write completion for the full connection filter stack
On 10 Sep 2014, at 18:19, Jim Jagielski j...@jagunet.com wrote:
> On Sep 10, 2014, at 12:07 PM, Graham Leggett minf...@sharp.fm wrote:
>> Having thought long and hard about this, giving filters an opportunity to write has nothing to do with either data or metadata, we just want to give the filter an opportunity to write. “Dear filter, please wake up and if you’ve setaside data please write it, otherwise do nothing, thanks”. Empty brigade means “do nothing new”.
>
> Hey filter, here is an empty brigade, meaning I have no data for you; if you have setaside data, now is the time to push it thru.

If someone is new to writing filters, I think that's a bit of a gotcha. An empty brigade meaning “do nothing new” is easy to code for and unsurprising.

I'm thinking about how I'd help httpd cope with a filter that didn't know about this special behaviour. I would introduce a new bucket, NUDGE, and use that to wake up filters. If a filter doesn't understand NUDGE, this could cause a hang from the client's point of view. To avoid this, I'd have a configurable timeout after which a NUDGE-d filter would get a FLUSH bucket - say 30 seconds as a default, nice and high because FLUSH is an expensive operation. Server admins could drop this if they know they have a problematic filter. If a filter gets a NUDGE and returns APR_SUCCESS, I think it makes sense to NUDGE the next downstream filter.

A new metadata bucket has all sorts of compatibility issues that need thinking about (and I don't know enough about them). Changing the meaning of “empty brigade” also has compatibility issues, but they will show up much later than build time.

-- 
Tim Bannister – is...@jellybaby.net
RE: [Patch] Async write completion for the full connection filter stack
From: Graham Leggett [mailto:minf...@sharp.fm]
Sent: Donnerstag, 11. September 2014 06:40
To: dev@httpd.apache.org
Subject: Re: [Patch] Async write completion for the full connection filter stack

On 11 Sep 2014, at 1:51 AM, Yann Ylavic ylavic@gmail.com wrote:

>> +    else if (*deferred_write_pool) {
>> +        /*
>> +         * There are no more requests in the pipeline. We can just clear the
>> +         * pool.
>> +         */
>
> Shouldn't *buffered_bb be set to NULL here when *deferred_write_pool == (*buffered_bb)->p, or more generally apr_pool_is_ancestor(*deferred_write_pool, (*buffered_bb)->p)? We can't leave a dangling pointer in this case.
>
>> +        apr_pool_clear(*deferred_write_pool);

Hmmm... this came from the original code. We can't set buffered_bb to NULL unless we are sure we created buffered_bb, and this isn't necessarily the case. In the core filter, buffered_bb is created when the connection is created.

How about doing an apr_brigade_cleanup(buffered_bb) before the apr_pool_clear?

Regards
Rüdiger
RE: [Patch] Async write completion for the full connection filter stack
-----Original Message-----
From: Yann Ylavic [mailto:ylavic@gmail.com]
Sent: Donnerstag, 11. September 2014 09:52
To: httpd
Subject: Re: [Patch] Async write completion for the full connection filter stack

> On Thu, Sep 11, 2014 at 6:39 AM, Graham Leggett minf...@sharp.fm wrote:
>> On 11 Sep 2014, at 1:51 AM, Yann Ylavic ylavic@gmail.com wrote:
>>> +    else if (*deferred_write_pool) {
>>> +        /*
>>> +         * There are no more requests in the pipeline. We can just clear the
>>> +         * pool.
>>> +         */
>>>
>>> Shouldn't *buffered_bb be set to NULL here when *deferred_write_pool == (*buffered_bb)->p, or more generally apr_pool_is_ancestor(*deferred_write_pool, (*buffered_bb)->p)? We can't leave a dangling pointer in this case.
>>>
>>> +        apr_pool_clear(*deferred_write_pool);
>>
>> Hmmm… this came from the original code. We can’t set buffered_bb to NULL unless we are sure we created buffered_bb, and this isn’t necessarily the case. In the core filter, buffered_bb is created when the connection is created.

Another question is whether we should clear a pool we haven't created. If we don't, then apr_pool_is_ancestor(*deferred_write_pool, (*buffered_bb)->p) should be a safe bet that we created the bucket brigade.

> Another approach below.
>
> If we or any filter has created the brigade on *deferred_write_pool (or an ancestor - I'm only talking about this case), it is dead now (inaccessible), and IMO should be marked as such. In the case where the caller lets the function handle the brigade (as we do), using NULL the first time only, (s)he expects not to crash whenever (s)he comes back with it after a clear (which (s)he isn't supposed to know about).

Agree. Maybe if buffered_bb is NULL we should add a pool cleanup to *deferred_write_pool that sets *buffered_bb to NULL.

> Another solution would be to not require a pool at all, something like :

I fear that this creates high memory consumption / temporary memory leaks.

Regards
Rüdiger
Re: [Patch] Async write completion for the full connection filter stack
On Thu, Sep 11, 2014 at 9:51 AM, Yann Ylavic ylavic@gmail.com wrote:
> On Thu, Sep 11, 2014 at 6:39 AM, Graham Leggett minf...@sharp.fm wrote:
>> I don’t follow - that means something different as I’m reading it. We must only signal that we have less data in the output filters if we actually originally had data in the output filters, ie buffered_bb existed and wasn’t empty.
>
> You're right, I misread ap_core_output_filter() where I thought ap_filter_reinstate_brigade() wasn't called when APR_BRIGADE_EMPTY(bb). Since ap_filter_reinstate_brigade() can be called with an empty bb, ap_core_output_filter() can (still) be simplified though, ie:
>
>     should_write = ap_filter_reinstate_brigade(f, ctx->buffered_bb, bb,
>                                                &flush_upto);
>     if (APR_BRIGADE_EMPTY(bb)) {
>         return APR_SUCCESS;
>     }
>     if (flush_upto != NULL) {
>         ...
>     }
>     if (should_write) {
>         ...
>     }
>
> flush_upto will always be NULL with an empty bb.

Again I missed something... This code is indeed what I'd like ap_core_output_filter() (or any filter) to be able to do, but it won't work without a slight change in ap_filter_reinstate_brigade(), which is:

    AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
                                                apr_bucket_brigade *buffered_bb,
                                                apr_bucket_brigade *bb,
                                                apr_bucket **flush_upto)
    {
        apr_bucket *bucket, *next;
        apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
        int eor_buckets_in_brigade, morphing_bucket_in_brigade;
        int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);

        *flush_upto = NULL;

        if ((buffered_bb != NULL) && !APR_BRIGADE_EMPTY(buffered_bb)) {
            int flush = APR_BRIGADE_EMPTY(bb);
            APR_BRIGADE_PREPEND(bb, buffered_bb);
            f->c->data_in_output_filters--;
            if (flush) {
                return 1;
            }
        }
        ...
    }

The new thing is the flush check, to return 1 immediately when an empty bb is given. In this case the intent of the caller is to use what we have (push it thru, as Jim would say), so we don't need to walk buffered_bb again.
That's, I think, what the original code did in ap_core_output_filter() - send_brigade_nonblocking() immediately in this case - and what the current patch misses. Thoughts?
Re: [Patch] Async write completion for the full connection filter stack
On Thu, Sep 11, 2014 at 2:12 PM, Yann Ylavic ylavic@gmail.com wrote:
> AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
>                                             apr_bucket_brigade *buffered_bb,
>                                             apr_bucket_brigade *bb,
>                                             apr_bucket **flush_upto)
> {
>     apr_bucket *bucket, *next;
>     apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
>     int eor_buckets_in_brigade, morphing_bucket_in_brigade;
>     int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
>
>     *flush_upto = NULL;

Oops, this is an unrelated change, but probably good to have too.
Re: [Patch] Async write completion for the full connection filter stack
On 11 Sep 2014, at 9:51 AM, Yann Ylavic ylavic@gmail.com wrote:

> If we or any filter has created the brigade on *deferred_write_pool (or an ancestor - I'm only talking about this case), it is dead now (inaccessible), and IMO should be marked as such. In the case where the caller lets the function handle the brigade (as we do), using NULL the first time only, (s)he expects not to crash whenever (s)he comes back with it after a clear (which (s)he isn't supposed to know about).

Agreed - as rpluem suggested a pool cleanup should be registered for this, I’ve added it.

> The core and mod_ssl deferred_write_pool won't be needed anymore.

The reason the deferred write pool exists is that connections might live for a very long time. Any memory allocated during that time will accumulate and won’t be cleaned up until the connection is closed. The deferred write pool allows us to temporarily create data during the connection and then clean it out as soon as we no longer need it.

> You're right, I misread ap_core_output_filter() where I thought ap_filter_reinstate_brigade() wasn't called when APR_BRIGADE_EMPTY(bb). Since ap_filter_reinstate_brigade() can be called with an empty bb, ap_core_output_filter() can (still) be simplified though, ie:
>
>     should_write = ap_filter_reinstate_brigade(f, ctx->buffered_bb, bb,
>                                                &flush_upto);
>     if (APR_BRIGADE_EMPTY(bb)) {
>         return APR_SUCCESS;
>     }
>     if (flush_upto != NULL) {
>         ...
>     }
>     if (should_write) {
>         ...
>     }
>
> flush_upto will always be NULL with an empty bb.

This is indeed true - I’ve simplified it as above.

Regards,
Graham
—
Re: [Patch] Async write completion for the full connection filter stack
On 11 Sep 2014, at 2:12 PM, Yann Ylavic ylavic@gmail.com wrote:

> Again I missed something... This code is indeed what I'd like ap_core_output_filter() (or any filter) to be able to do, but it won't work without a slight change in ap_filter_reinstate_brigade(), which is:
>
>     AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
>                                                 apr_bucket_brigade *buffered_bb,
>                                                 apr_bucket_brigade *bb,
>                                                 apr_bucket **flush_upto)
>     {
>         apr_bucket *bucket, *next;
>         apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
>         int eor_buckets_in_brigade, morphing_bucket_in_brigade;
>         int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
>
>         *flush_upto = NULL;
>
>         if ((buffered_bb != NULL) && !APR_BRIGADE_EMPTY(buffered_bb)) {
>             int flush = APR_BRIGADE_EMPTY(bb);
>             APR_BRIGADE_PREPEND(bb, buffered_bb);
>             f->c->data_in_output_filters--;
>             if (flush) {
>                 return 1;
>             }
>         }
>         ...
>     }
>
> The new thing is the flush check, to return 1 immediately when an empty bb is given. In this case the intent of the caller is to use what we have (push it thru, as Jim would say), so we don't need to walk buffered_bb again. That's, I think, what the original code did in ap_core_output_filter() - send_brigade_nonblocking() immediately in this case - and what the current patch misses. Thoughts?

We need this, yes - if we’ve been called with an empty brigade and we still have a buffered brigade that is “too small to write”, we want to just write that remaining brigade regardless of its size, otherwise we spin.

I would still do the flush check anyway - ap_filter_reinstate_brigade() has no idea what happened outside this function; we may have flushed buckets, or we may not have and set aside those flush buckets for whatever reason, and we want the behaviour to remain predictable.

I’ve updated the patch to force the should_write if the original brigade was empty.

Regards,
Graham
—

httpd-async-fullstack-ssl3.patch
Description: Binary data
Re: [Patch] Async write completion for the full connection filter stack
On 09/11/2014 05:20 PM, Graham Leggett wrote:
> On 11 Sep 2014, at 9:51 AM, Yann Ylavic ylavic@gmail.com wrote:
>> If we or any filter has created the brigade on *deferred_write_pool (or an ancestor - I'm only talking about this case), it is dead now (inaccessible), and IMO should be marked as such. In the case where the caller lets the function handle the brigade (as we do), using NULL the first time only, (s)he expects not to crash whenever (s)he comes back with it after a clear (which (s)he isn't supposed to know about).
>
> Agreed - as rpluem suggested a pool cleanup should be registered for this, I’ve added it.

I know that I proposed this, but rethinking I see one risk with the cleanup. Take the following code, with long_living_apr_bucket_brigade_ptr being NULL in the beginning:

    apr_bucket_brigade *stack_variable;

    stack_variable = long_living_apr_bucket_brigade_ptr;
    ap_save_brigade(f, &stack_variable, &bb, *deferred_write_pool);
    long_living_apr_bucket_brigade_ptr = stack_variable;

I guess in this case this could cause unpredictable behaviour, as the stack of the calling function is likely to have been repurposed by the time the pool dies. Do we see that as something that we need to consider, or do we see this as a bug on the caller side? If we see it as a bug on the caller side we should probably document our expectations.

Regards
Rüdiger
Re: [Patch] Async write completion for the full connection filter stack
On 09 Sep 2014, at 9:19 PM, Plüm, Rüdiger, Vodafone Group ruediger.pl...@vodafone.com wrote:

>> I spent a lot of time going down this path of having a dedicated metabucket, and quickly got bogged down in complexity. The key problem was what does a filter actually do when you get one, it was unclear
>
> Don't we have the same problem with an empty brigade? Some filters are not going to handle it as we expect. Hence the additional logic in ap_pass_brigade. I guess the minimum behavior we need to get from every filter is to ignore and pass on.

We don’t have the same problem with an empty brigade, no. Empty brigades “fail safe”: a naive filter will simply try to loop over the brigade and exit immediately, having done nothing. Meta buckets are inherently unsafe in comparison, because if a module doesn’t pass the meta bucket downstream for whatever buggy reason, the server will start to spin. We could use compensation code to stop this, but the compensation code would be complex. Testing for an empty brigade is cheap and trivial; looping through a brigade to search for NOOP buckets so as to trigger a bypass of the compensation code is not so trivial.

>> and it made my head bleed. That makes life hard for module authors and that is bad. As I recall there were also broken filters out there that only knew about FLUSH and EOS buckets (eg ap_http_chunk_filter()).
>
> We already have additional metabuckets like error buckets or EOR. So I don't see an issue creating a new one. Any filter not passing a meta bucket that it does not understand, or not even trying to process it, is simply broken.

Agreed that such a filter is broken, but such filters exist (the chunk filter being one example; if I looked harder I would find others) and we need to work within this limitation.

Having thought long and hard about this, giving filters an opportunity to write has nothing to do with either data or metadata, we just want to give the filter an opportunity to write. “Dear filter, please wake up and if you’ve setaside data please write it, otherwise do nothing, thanks”. Empty brigade means “do nothing new”.

>> The problem we're trying to solve is one of starvation - no filters can set aside data for later (except core via the NULL hack), because there is no guarantee that they'll ever be called again later. You have to write it now, or potentially write it never. The start of the solution is to ensure filters aren't starved: if you have data in the output filters - and obviously you have no idea which filters have setaside data - you need a way to wake them all up. The simplest and least disruptive way is to pass them all an empty brigade, job done. We've got precedent for this - we've been sending NULL to the core filter to achieve the same
>
> But this is *our* filter and it will not hit any custom filters. So we can do this kind of hacky game here.
>
>> thing, we want something that works with any filter.
>
> Yes, and this is the reason why I still believe a meta bucket is better.

I suspect it is not completely clear how the patch works.

Previously, in the core, we handled write completion by passing NULL to the last filter in the chain, the core output filter. This worked fine for simple requests, but as soon as another filter was involved - primarily mod_ssl - there was no write completion. The work of mod_ssl had to be completed in its entirety before write completion would kick in, and by that point it was too late to be useful.

The fix for this problem starts by including the entire connection filter chain in write completion, not just the last filter. That means every connection filter, both httpd internal and external, can take advantage of write completion, not just the core filter.

So what do you pass during this write completion phase? Obviously not data - you finished passing the data when the handler exited. We can’t pass NULL like the core filter accepts, because the other filters would just segfault. This leaves an obvious choice - an empty brigade.

We have a new problem - not all filters pass data downstream, for one of many reasons. Maybe they’re buggy. Maybe they’re buffering data. Maybe that’s what the filter believes it was asked to do. So we must compensate for this by “waking up” filters downstream of the filter that didn’t write. All we need to do to those filters is wake them up, that’s it.

At this point we’re still not done - we need cooperation from filters in order to support write completion: a filter must yield and set aside the data it was given, to be called back later, just like the core does. This gives other requests an opportunity to get CPU time. It turns out the algorithm to achieve this yield is built into the core filter, so the next step is to generalise the algorithm and make it available to any filter, starting with mod_ssl.

The attached patch does that.

Regards,
Graham
—

httpd-async-fullstack-ssl.patch
Description: Binary data
Re: [Patch] Async write completion for the full connection filter stack
On Sep 10, 2014, at 12:07 PM, Graham Leggett minf...@sharp.fm wrote:

> Having thought long and hard about this, giving filters an opportunity to write has nothing to do with either data or metadata, we just want to give the filter an opportunity to write. “Dear filter, please wake up and if you’ve setaside data please write it, otherwise do nothing, thanks”. Empty brigade means “do nothing new”.

Hey filter, here is an empty brigade, meaning I have no data for you; if you have setaside data, now is the time to push it thru.
Re: [Patch] Async write completion for the full connection filter stack
On Wed, Sep 10, 2014 at 6:07 PM, Graham Leggett minf...@sharp.fm wrote:
> The attached patch does that.

Hi Graham, nice patch, thank you. A few comments below...

> +AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
> +                                                    apr_bucket_brigade **buffered_bb,
> +                                                    apr_pool_t **deferred_write_pool,
> +                                                    apr_bucket_brigade *bb)
> +{
> +    if (bb == NULL) {

Is this test really needed? It does not seem to be possible with your patch, and now that this is part of the API, a crash is often better in this case (as for any other ap_filter_*() function which is given a NULL bb, or the ap[r] ones in general, which tend - sanely - to crash where the code is faulty).

> +        return APR_SUCCESS;
> +    }
> +    if (!APR_BRIGADE_EMPTY(bb)) {
> +        if (APR_BRIGADE_EMPTY((*buffered_bb))) {
> +            f->c->data_in_output_filters++;
> +        }
> +        if (bb != *buffered_bb) {
> +            if (!(*deferred_write_pool)) {
> +                apr_pool_create(deferred_write_pool, f->c->pool);
> +                apr_pool_tag(*deferred_write_pool, "deferred_write");
> +            }
> +            return ap_save_brigade(f, buffered_bb, &bb,
> +                                   *deferred_write_pool);
> +        }
> +    }
> +    else if (*deferred_write_pool) {
> +        /*
> +         * There are no more requests in the pipeline. We can just clear the
> +         * pool.
> +         */

Shouldn't *buffered_bb be set to NULL here when *deferred_write_pool == (*buffered_bb)->p, or more generally apr_pool_is_ancestor(*deferred_write_pool, (*buffered_bb)->p)? We can't leave a dangling pointer in this case.

> +        apr_pool_clear(*deferred_write_pool);
> +    }
> +    return APR_SUCCESS;
> +}
> +
> +AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
> +                                            apr_bucket_brigade *buffered_bb,
> +                                            apr_bucket_brigade *bb,
> +                                            apr_bucket **flush_upto)
> +{
> +    apr_bucket *bucket, *next;
> +    apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
> +    int eor_buckets_in_brigade, morphing_bucket_in_brigade;
> +    int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
> +
> +    if ((buffered_bb != NULL) && !APR_BRIGADE_EMPTY(buffered_bb)) {
> +        APR_BRIGADE_PREPEND(bb, buffered_bb);
> +        f->c->data_in_output_filters--;

Maybe we could use instead:

    if (!APR_BRIGADE_EMPTY(bb)) {
        f->c->data_in_output_filters--;
    }
    APR_BRIGADE_PREPEND(bb, buffered_bb);

so that it remains symmetric with ap_filter_setaside_brigade() above (which increments c->data_in_output_filters). It also has the advantage of simplifying the calling code in ap_core_output_filter(), where we could simply do:

    if ((ctx->buffered_bb != NULL) && !APR_BRIGADE_EMPTY(ctx->buffered_bb)) {
        should_write = ap_filter_reinstate_brigade(f, ctx->buffered_bb, bb,
                                                   &flush_upto);
    }
    else if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }
    if (flush_upto != NULL) {
        ...
    }
    if (should_write) {
        ...
    }

removing the empty-variable dance and the duplicated send_brigade_nonblocking() code.

> +    }
> +
> [...]
> +
> +    bytes_in_brigade = 0;
> +    non_file_bytes_in_brigade = 0;
> +    eor_buckets_in_brigade = 0;
> +    morphing_bucket_in_brigade = 0;
> +
> +    for (bucket = APR_BRIGADE_FIRST(bb); bucket != APR_BRIGADE_SENTINEL(bb);
> +         bucket = next) {
> +        next = APR_BUCKET_NEXT(bucket);
> +
> [...]
> +
> +        if (APR_BUCKET_IS_FLUSH(bucket)
> +            || non_file_bytes_in_brigade >= THRESHOLD_MAX_BUFFER
> +            || morphing_bucket_in_brigade
> +            || eor_buckets_in_brigade > MAX_REQUESTS_IN_PIPELINE) {
> +            /* this segment of the brigade MUST be sent before returning. */
> +
> [...]
> +            /*
> +             * Defer the actual blocking write to avoid doing many writes.
> +             */
> +            if (flush_upto) {
> +                *flush_upto = next;
> +            }
> +
> +            bytes_in_brigade = 0;
> +            non_file_bytes_in_brigade = 0;
> +            eor_buckets_in_brigade = 0;
> +            morphing_bucket_in_brigade = 0;

By resetting the counters here, and mainly bytes_in_brigade, the value returned by the function indicates whether we should write (nonblocking) the bytes *after* the flush_upto bucket, i.e. not including the write (blocking) we should do for the bytes *before* this bucket. This is not clear in the description of the function (util_filter.h), particularly because flush_upto can be passed NULL. Does it make sense to call it without flush_upto? How would the caller know about the preceding bytes?

> +        }
> +    }
> +
> [...]
> +
> +    return bytes_in_brigade >= THRESHOLD_MIN_WRITE;
> +}
> +
> +AP_DECLARE(apr_status_t) ap_filter_should_yield(ap_filter_t *f)
> +{

Shouldn't this function return an int?

> +    return f->c->data_in_output_filters;
> +}
> +

Regards,
Yann.
Re: [Patch] Async write completion for the full connection filter stack
On Thu, Sep 11, 2014 at 1:51 AM, Yann Ylavic ylavic@gmail.com wrote:
> On Wed, Sep 10, 2014 at 6:07 PM, Graham Leggett minf...@sharp.fm wrote:
>> +AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f,
>> +                                            apr_bucket_brigade *buffered_bb,
>> +                                            apr_bucket_brigade *bb,
>> +                                            apr_bucket **flush_upto)
>> +{
>> +    apr_bucket *bucket, *next;
>> +    apr_size_t bytes_in_brigade, non_file_bytes_in_brigade;
>> +    int eor_buckets_in_brigade, morphing_bucket_in_brigade;
>> +    int loglevel = ap_get_conn_module_loglevel(f->c, APLOG_MODULE_INDEX);
>> +
>> +    if ((buffered_bb != NULL) && !APR_BRIGADE_EMPTY(buffered_bb)) {
>> +        APR_BRIGADE_PREPEND(bb, buffered_bb);
>> +        f->c->data_in_output_filters--;
>
> Maybe we could use instead:
>
>     if (!APR_BRIGADE_EMPTY(bb)) {
>         f->c->data_in_output_filters--;
>     }
>     APR_BRIGADE_PREPEND(bb, buffered_bb);
>
> so that it remains symmetric with ap_filter_setaside_brigade() above (which increments c->data_in_output_filters). It also has the advantage of simplifying the calling code in ap_core_output_filter(), where we could simply do:
>
>     if ((ctx->buffered_bb != NULL) && !APR_BRIGADE_EMPTY(ctx->buffered_bb)) {
>         should_write = ap_filter_reinstate_brigade(f, ctx->buffered_bb, bb,
>                                                    &flush_upto);
>     }
>     else if (APR_BRIGADE_EMPTY(bb)) {
>         return APR_SUCCESS;
>     }

Or even simpler:

    should_write = ap_filter_reinstate_brigade(f, ctx->buffered_bb, bb,
                                               &flush_upto);
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }

>     if (flush_upto != NULL) {
>         ...
>     }
>     if (should_write) {
>         ...
>     }
>
> removing the empty-variable dance and the duplicated send_brigade_nonblocking() code.
>
>> +    }
Re: [Patch] Async write completion for the full connection filter stack
On 11 Sep 2014, at 1:51 AM, Yann Ylavic ylavic@gmail.com wrote: +{ +if (bb == NULL) { Is this test really needed? This does not seem to be possible with you patch, and now this is part of the API, a crash if often better in this case (like for any other ap_filter_*() function which is given a NULL bb, or the ap[r] ones in general which tend -sanely- to crash where the code is faulty). The test doesn’t seem to be needed no, that’s legacy. +else if (*deferred_write_pool) { +/* + * There are no more requests in the pipeline. We can just clear the + * pool. + */ Shouldn't *buffered_bb be set to NULL here when *deferred_write_pool == (*buffered_bb)-p, or more generally apr_pool_is_ancestor(*deferred_write_pool, (*buffered_bb)-p). We can't leave a dangling pointer in this case. +apr_pool_clear(*deferred_write_pool); Hmmm… this came from the original code. httpd-async-fullstack-ssl2.patch Description: Binary data We can’t set buffered_bb to NULL unless we are sure we created buffered_bb, and this isn’t necessarily the case. In the core filter, buffered_bb is created when the connection is created. +} +return APR_SUCCESS; +} + +AP_DECLARE(int) ap_filter_reinstate_brigade(ap_filter_t *f, +apr_bucket_brigade *buffered_bb, +apr_bucket_brigade *bb, +apr_bucket **flush_upto) +{ +apr_bucket *bucket, *next; +apr_size_t bytes_in_brigade, non_file_bytes_in_brigade; +int eor_buckets_in_brigade, morphing_bucket_in_brigade; +int loglevel = ap_get_conn_module_loglevel(f-c, APLOG_MODULE_INDEX); + +if ((buffered_bb != NULL) !APR_BRIGADE_EMPTY(buffered_bb)) { +APR_BRIGADE_PREPEND(bb, buffered_bb); +f-c-data_in_output_filters--; Maybe we could use instead : if (!APR_BRIGADE_EMPTY(bb)) { f-c-data_in_output_filters--; } APR_BRIGADE_PREPEND(bb, buffered_bb); so that it remains symetric with ap_filter_setaside_brigade() above (which increments c-data_in_output_filters). I don’t follow - that means something different as I’m reading it. 
We must only signal that we have less data in the output filters if we actually originally had data in the output filters, ie buffered_bb existed and wasn’t empty. It also has the advantage to simplify the calling code in ap_core_output_filter(), where we could simply do : if ((ctx-buffered_bb != NULL) !APR_BRIGADE_EMPTY(ctx-buffered_bb)) { should_write = ap_filter_reinstate_brigade(f, ctx-buffered_bb, bb, flush_upto); } else if (APR_BRIGADE_EMPTY(bb)) { return APR_SUCCESS; } if (flush_upto != NULL) { ... } if (should_write) { ... } removing the empty variable danse and the duplicated send_brigade_nonblocking() code. +} + [...] + +bytes_in_brigade = 0; +non_file_bytes_in_brigade = 0; +eor_buckets_in_brigade = 0; +morphing_bucket_in_brigade = 0; + +for (bucket = APR_BRIGADE_FIRST(bb); bucket != APR_BRIGADE_SENTINEL(bb); + bucket = next) { +next = APR_BUCKET_NEXT(bucket); + [...] + +if (APR_BUCKET_IS_FLUSH(bucket) +|| non_file_bytes_in_brigade = THRESHOLD_MAX_BUFFER +|| morphing_bucket_in_brigade +|| eor_buckets_in_brigade MAX_REQUESTS_IN_PIPELINE) { +/* this segment of the brigade MUST be sent before returning. */ + [...] +/* + * Defer the actual blocking write to avoid doing many writes. + */ +if (flush_upto) { +*flush_upto = next; +} + +bytes_in_brigade = 0; +non_file_bytes_in_brigade = 0; +eor_buckets_in_brigade = 0; +morphing_bucket_in_brigade = 0; By resetting the counters here, and mainly bytes_in_brigade, the value returned by the function indicates whether we should write (nonblocking) the bytes *after* the flush_upto bucket, ie. not including the write (blocking) we should do for the bytes *before* this bucket. This is not clear in the description of the function (util_filter.h), particularly because flush_upto can be passed NULL. Does it make sense to call it without flush_upto? How would the caller know about the preceding bytes? True. I’ve made flush_upto mandatory. +} +} + [...] 
+
+    return bytes_in_brigade >= THRESHOLD_MIN_WRITE;
+}
+
+AP_DECLARE(apr_status_t) ap_filter_should_yield(ap_filter_t *f)
+{

Shouldn't this function return an int?

It should. Fixed.

Regards, Graham
—

httpd-async-fullstack-ssl2.patch Description: Binary data
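The flush decision being reviewed in ap_filter_reinstate_brigade() can be modelled in a few lines. The following is a hedged, self-contained sketch of the counter logic only: the bucket kinds, threshold values, and all names below are illustrative stand-ins, not httpd's real types or constants.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for httpd's thresholds; not the real values. */
#define MAX_BUFFER   65536   /* cf. THRESHOLD_MAX_BUFFER */
#define MIN_WRITE     4096   /* cf. THRESHOLD_MIN_WRITE */
#define MAX_PIPELINE     5   /* cf. MAX_REQUESTS_IN_PIPELINE */

enum kind { DATA_HEAP, DATA_FILE, META_FLUSH, META_EOR, MORPHING };

struct bucket { enum kind kind; size_t len; };

/* Walk the "brigade", accumulating counters; whenever a trigger fires,
 * everything up to and including that bucket must be sent with a blocking
 * write, and the counters restart for what follows. Returns nonzero if the
 * bytes *after* the flush point reach the minimum nonblocking write size;
 * *flush_upto is the index just past the last mandatory-flush segment
 * (0 if none). */
static int reinstate_model(const struct bucket *b, size_t n, size_t *flush_upto)
{
    size_t bytes = 0, non_file = 0;
    size_t i;
    int eor = 0, morphing = 0;

    *flush_upto = 0;
    for (i = 0; i < n; i++) {
        switch (b[i].kind) {
        case META_EOR:   eor++; break;
        case MORPHING:   morphing = 1; break;
        case DATA_FILE:  bytes += b[i].len; break;
        case DATA_HEAP:  bytes += b[i].len; non_file += b[i].len; break;
        case META_FLUSH: break;
        }
        if (b[i].kind == META_FLUSH
            || non_file >= MAX_BUFFER
            || morphing
            || eor > MAX_PIPELINE) {
            /* This segment must be written before returning; restart the
             * counters for the buckets that follow it. */
            *flush_upto = i + 1;
            bytes = non_file = 0;
            eor = 0;
            morphing = 0;
        }
    }
    return bytes >= MIN_WRITE;
}
```

As the review points out, the return value only describes the bytes after the flush point, which is exactly why handing the flush point back to the caller was made mandatory.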
RE: [Patch] Async write completion for the full connection filter stack
-----Original Message----- From: Jim Jagielski [mailto:j...@jagunet.com] Sent: Monday, 8 September 2014 21:31 To: dev@httpd.apache.org Subject: Re: [Patch] Async write completion for the full connection filter stack

Another consideration: We now have the idea of a master and slave connection, and maybe something there would also help... FWIW: I like using an empty bucket conceptually since it should be easy and quick to check.

Agreed, but I think that from a design perspective, using the empty brigade gives it a side-effect meaning that is not immediately obvious, especially if you are just developing modules. Thinking this further, we might need some kind of advisor API for the filters that tells them how much data they should consume to avoid buffering too much, and how much they can send down the chain without ending up in a blocking write. How much buffering is advised could be set by a configuration directive.

Regards Rüdiger

On Sep 8, 2014, at 2:53 PM, Ruediger Pluem rpl...@apache.org wrote: Wouldn't it make more sense, instead of using an empty brigade, to create yet another metabucket that signals write completion? It could also contain information about how much data to send down the chain for single filters if they e.g. send heap or transient buckets. Otherwise how should they know? If you have a filter that has a large file bucket set aside, and it transforms it e.g. to a heap bucket during its processing because it changes data in it, I guess it doesn't make sense for it to send all its data at once when it gets triggered for write completion, as we would end up in a blocking write in the core filter. But if it knows how much is left in the core filter buffer, it could try to send just that and thus avoid blocking writes. 
And if there is no room left in the buffer or if what is left is too small for the filter to operate on it, the filter could just pass the bucket down the chain and if it would end up in the core output filter, the core output filter would just try to write what it has buffered. Regards Rüdiger Jim Jagielski wrote: Gotcha... +1 On Sep 8, 2014, at 11:29 AM, Graham Leggett minf...@sharp.fm wrote: On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote: This is pretty cool... haven't played too much with it, but via inspection I like the implementation.
Re: [Patch] Async write completion for the full connection filter stack
On Mon, 2014-09-08 at 17:25 +0200, Graham Leggett wrote:

Ideally, filters should do this, but generally they don’t:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return ap_pass_brigade(f->next, bb);
    }

Why on Earth should filters want to do that, as opposed to:

Some filters, like mod_deflate, do this:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }

or similar variants? In these cases ap_pass_brigade() is never called, so we detect this by keeping a marker that is changed on every call to ap_pass_brigade(). If the marker wasn’t changed during the call to the filter, we compensate by calling each downstream filter until the marker is changed, or we run out of filters.

Yes. The logic is that we call ap_pass_brigade when there's something to pass. Not when there's nothing: that would just look like superfluous overhead. If you have a reason to propagate an immediate event regardless of that logic, surely that's the business of a FLUSH bucket. Then the question becomes: is it ever right to absorb (or buffer) and fail to propagate a FLUSH? You seem instead to be ascribing FLUSH semantics to an empty brigade! As a filter developer, it's my business to pass a brigade when: 1) I'm ready to pass data. 2) I encounter EOS, when I must finish up and propagate it. 3) I am explicitly signalled to FLUSH whatever I can. What am I missing? Do we have a need to refine the FLUSH bucket type? Maybe an EVENT bucket carrying an event descriptor?

-- Nick Kew
Re: [Patch] Async write completion for the full connection filter stack
On 09 Sep 2014, at 10:58 AM, Nick Kew n...@webthing.com wrote:

Ideally, filters should do this, but generally they don’t:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return ap_pass_brigade(f->next, bb);
    }

Why on Earth should filters want to do that, as opposed to:

Some filters, like mod_deflate, do this:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }

or similar variants?

Because if they did, the compensation code in ap_pass_brigade() wouldn’t be necessary.

In these cases ap_pass_brigade() is never called, so we detect this by keeping a marker that is changed on every call to ap_pass_brigade(). If the marker wasn’t changed during the call to the filter, we compensate by calling each downstream filter until the marker is changed, or we run out of filters.

Yes. The logic is that we call ap_pass_brigade when there's something to pass. Not when there's nothing: that would just look like superfluous overhead. If you have a reason to propagate an immediate event regardless of that logic, surely that's the business of a FLUSH bucket. Then the question becomes: is it ever right to absorb (or buffer) and fail to propagate a FLUSH? You seem instead to be ascribing FLUSH semantics to an empty brigade!

To be clear, an empty brigade does _not_ mean flush, not even slightly. Flush means “stop everything and perform this potentially expensive task to completion right now”, and is the exact opposite of what we’re trying to achieve.

As a filter developer, it's my business to pass a brigade when: 1) I'm ready to pass data. 2) I encounter EOS, when I must finish up and propagate it. 3) I am explicitly signalled to FLUSH whatever I can. What am I missing? Do we have a need to refine the FLUSH bucket type? Maybe an EVENT bucket carrying an event descriptor?

In a synchronous world where it doesn’t matter how long a unit of work takes, sure. 
In an async world, where you need to break up long-running tasks into short-running ones so that others get a chance to have their data sent in the same thread, this doesn’t work. Filters need to be able to yield and set aside data when they’re given too much data to process, just like the core filter can - but right now they can’t, because such a filter will never get called again, because upstream has no data to send.

Regards, Graham
—
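The yielding behaviour described here - consume a bounded amount per invocation, set the rest aside, and resume on the next wakeup - can be sketched as a self-contained toy model. The names, the budget, and the calling convention below are illustrative assumptions, not the httpd filter API.

```c
#include <assert.h>
#include <stddef.h>

#define YIELD_BUDGET 16384  /* illustrative per-invocation limit */

struct filter_state {
    const char *setaside;   /* data parked for a later invocation */
    size_t setaside_len;
    size_t written;         /* total bytes "sent" downstream */
};

/* One invocation: drain parked data first, then take new input up to the
 * budget, and park whatever is left instead of blocking. In this model the
 * upstream is assumed to hold off (cf. ap_filter_should_yield) while data
 * is parked, so new input never arrives before the setaside is drained. */
static void filter_invoke(struct filter_state *f, const char *in, size_t len)
{
    size_t budget = YIELD_BUDGET;
    size_t take;

    /* Reinstate previously set-aside data ahead of any new input. */
    if (f->setaside_len) {
        take = f->setaside_len < budget ? f->setaside_len : budget;
        f->written += take;
        f->setaside += take;
        f->setaside_len -= take;
        budget -= take;
    }

    /* Consume new input up to the remaining budget; park the rest. */
    take = len < budget ? len : budget;
    f->written += take;
    if (len > take) {
        f->setaside = in + take;
        f->setaside_len = len - take;
    }
}
```

Calling filter_invoke with an empty input models the empty-brigade wakeup: the filter gets a turn to drain its setaside even though upstream has nothing to send.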
Re: [Patch] Async write completion for the full connection filter stack
On 08 Sep 2014, at 8:53 PM, Ruediger Pluem rpl...@apache.org wrote:

Wouldn't it make more sense, instead of using an empty brigade, to create yet another metabucket that signals write completion? It could also contain information about how much data to send down the chain for single filters if they e.g. send heap or transient buckets. Otherwise how should they know? If you have a filter that has a large file bucket set aside, and it transforms it e.g. to a heap bucket during its processing because it changes data in it, I guess it doesn't make sense for it to send all its data at once when it gets triggered for write completion, as we would end up in a blocking write in the core filter. But if it knows how much is left in the core filter buffer, it could try to send just that and thus avoid blocking writes. And if there is no room left in the buffer, or if what is left is too small for the filter to operate on, the filter could just pass the bucket down the chain; if it ends up in the core output filter, the core output filter would just try to write what it has buffered.

I spent a lot of time going down this path of having a dedicated metabucket, and quickly got bogged down in complexity. The key problem was “what does a filter actually do when you get one”: it was unclear, and it made my head bleed. That makes life hard for module authors, and that is bad. As I recall there were also broken filters out there that only knew about FLUSH and EOS buckets (e.g. ap_http_chunk_filter()). The problem we’re trying to solve is one of starvation - no filters can set aside data for later (except core via the NULL hack), because there is no guarantee that they’ll ever be called again later. You have to write it now, or potentially write it never. The start of the solution is to ensure filters aren’t starved: if you have data in the output filters - and obviously you have no idea which filters have setaside data - you need a way to wake them all up. 
The simplest and least disruptive way is to pass them all an empty brigade - job done. We’ve got precedent for this: we’ve been sending NULL to the core filter to achieve the same thing, and we want something that works with any filter. The second part of the problem is filters biting off more than they can chew. Example: give mod_ssl a 1GB file bucket, and mod_ssl won’t yield until that entire 1GB file has been sent, for the (now solved) reason above. The next step to enable write completion is to teach filters like mod_ssl to yield when handling large quantities of data. The core filter has an algorithm to yield, including various checks for flow control and sanity with respect to file handles. If a variant of this algorithm could be exposed generically and made available to critical filters like mod_ssl, we’ll crack write completion.

Regards, Graham
—
Re: [Patch] Async write completion for the full connection filter stack
This is pretty cool... haven't played too much with it, but via inspection I like the implementation. One question:

    ===
    --- server/util_filter.c    (revision 1622873)
    +++ server/util_filter.c    (working copy)
    @@ -566,6 +566,16 @@
     {
         if (next) {
             apr_bucket *e;
    +        unsigned int activity;
    +        int empty = APR_BRIGADE_EMPTY(bb);
    +        apr_status_t status;
    ...
    +        while (APR_SUCCESS == status && empty) {
    +            next = next->next;
    +            if (next && next->c->activity == activity) {
    +                status = next->frec->filter_func.out_func(next, bb);
    +            }
    +            else {
    +                break;
    +            }
    +        }
    +
    +        return status;

Why does the while check for empty? Wouldn't it be faster to wrap that while in an 'if (empty)'? It would save an additional check on a var that doesn't change on each loop through the while().
Re: [Patch] Async write completion for the full connection filter stack
On 08 Sep 2014, at 7:50 AM, Nick Kew n...@webthing.com wrote:

+/* No problems found, and were we sent an empty brigade, and
+ * did this empty brigade not get passed on by a filter to the next
+ * filter in the chain? Compensate by passing the empty brigade to
+ * the next filter, so every filter gets a turn to write. This
+ * occurs during write completion.
+ */

You mean this code can only ever be entered during write completion?

In theory yes, but practically only when the brigade is empty and there is nothing to write.

Else how does this play with a filter that legitimately/necessarily buffers everything - e.g. XSLT?

Right now, that filter will carry on doing that.

Or a brigade containing a single metadata bucket and no data, introduced by one filter as a signal to another?

Then the brigade wouldn’t be empty.

Basically, what the code does is ensure that every filter in the chain gets called, and therefore has an opportunity to write cached data, regardless of whether one filter has elected not to call ap_pass_brigade(). The filters are called in order, and no filters are missed. Ideally, filters should do this, but generally they don’t:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return ap_pass_brigade(f->next, bb);
    }

Some filters, like mod_deflate, do this:

    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }

Others do the same thing by virtue of looping across the brigade bucket by bucket and then exiting: no buckets, we just exit. In these cases ap_pass_brigade() is never called, so we detect this by keeping a marker that is changed on every call to ap_pass_brigade(). If the marker wasn’t changed during the call to the filter, we compensate by calling each downstream filter until the marker is changed, or we run out of filters. 
One thing to think about is whether to do this unconditionally - in other words, send an empty brigade downstream whenever no downstream ap_pass_brigade() was detected, regardless of whether the incoming brigade was empty or not. This covers the case where a buffering filter like the one you describe is reading in data from request N+1 while data in the core from request N is still being streamed over the network (and isn’t, because the core filter never gets called until buffering-filter is done). Request N isn’t delayed waiting for request N+1 to finish. To ensure that we don’t waste cycles, we can do this when c->data_in_output_filters is non-zero. We can also turn c->data_in_output_filters from a switch usable by the core filter only into a nested counter usable by any filter, so any filter can set aside buckets for any reason, and we’ll do the right thing. Patch attached.

Regards, Graham
—

httpd-async-fullstack2.patch Description: Binary data
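The marker-and-compensate mechanism described in this message can be modelled with plain function pointers. This is an illustrative sketch, not the httpd API: a shared activity counter is bumped whenever a filter passes the brigade on, and if the counter didn't move during a filter's invocation, the caller walks down the chain itself so every filter still gets a turn.

```c
#include <assert.h>
#include <stddef.h>

struct chain { unsigned int activity; };

struct filt {
    int (*func)(struct chain *c, struct filt *self);  /* 0 on success */
    struct filt *next;
    int got_turn;   /* for checking: was this filter ever invoked? */
};

static int pass_compensating(struct chain *c, struct filt *next);

/* A well-behaved filter: bumps the marker and passes the brigade on. */
static int passing_filter(struct chain *c, struct filt *self)
{
    self->got_turn = 1;
    c->activity++;                       /* marker: brigade was passed on */
    return pass_compensating(c, self->next);
}

/* A short-circuiting filter: swallows an empty brigade (like the
 * mod_deflate-style early return) and never calls downstream. */
static int swallowing_filter(struct chain *c, struct filt *self)
{
    self->got_turn = 1;
    return 0;
}

/* Model of the compensation in ap_pass_brigade: if the activity marker
 * did not change during a filter's invocation, call the next filter
 * ourselves, until the marker moves or the chain runs out. */
static int pass_compensating(struct chain *c, struct filt *next)
{
    int status = 0;
    while (next && status == 0) {
        unsigned int activity = c->activity;
        status = next->func(c, next);
        if (c->activity != activity) {
            break;   /* downstream was reached; nothing to compensate */
        }
        next = next->next;   /* starved filter: give it a turn ourselves */
    }
    return status;
}
```

Without the compensation loop, a filter placed after the swallowing one would never run once the brigade is absorbed; with it, every downstream filter is still called in order.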
Re: [Patch] Async write completion for the full connection filter stack
On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote:

This is pretty cool... haven't played too much with it, but via inspection I like the implementation. One question:

    ===
    --- server/util_filter.c    (revision 1622873)
    +++ server/util_filter.c    (working copy)
    @@ -566,6 +566,16 @@
     {
         if (next) {
             apr_bucket *e;
    +        unsigned int activity;
    +        int empty = APR_BRIGADE_EMPTY(bb);
    +        apr_status_t status;
    ...
    +        while (APR_SUCCESS == status && empty) {
    +            next = next->next;
    +            if (next && next->c->activity == activity) {
    +                status = next->frec->filter_func.out_func(next, bb);
    +            }
    +            else {
    +                break;
    +            }
    +        }
    +
    +        return status;

Why does the while check for empty? Wouldn't it be faster to wrap that while in an 'if (empty)'? It would save an additional check on a var that doesn't change on each loop through the while().

It would - that was originally an if that became a while when I realised we had to compensate for multiple filters not calling ap_pass_brigade(), instead of just one. The new patch takes out empty completely and instead tracks c->data_in_output_filters, which could change during the loop.

Regards, Graham
—
Re: [Patch] Async write completion for the full connection filter stack
Gotcha... +1

On Sep 8, 2014, at 11:29 AM, Graham Leggett minf...@sharp.fm wrote:

On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote:

This is pretty cool... haven't played too much with it, but via inspection I like the implementation. One question:

    ===
    --- server/util_filter.c    (revision 1622873)
    +++ server/util_filter.c    (working copy)
    @@ -566,6 +566,16 @@
     {
         if (next) {
             apr_bucket *e;
    +        unsigned int activity;
    +        int empty = APR_BRIGADE_EMPTY(bb);
    +        apr_status_t status;
    ...
    +        while (APR_SUCCESS == status && empty) {
    +            next = next->next;
    +            if (next && next->c->activity == activity) {
    +                status = next->frec->filter_func.out_func(next, bb);
    +            }
    +            else {
    +                break;
    +            }
    +        }
    +
    +        return status;

Why does the while check for empty? Wouldn't it be faster to wrap that while in an 'if (empty)'? It would save an additional check on a var that doesn't change on each loop through the while().

It would - that was originally an if that became a while when I realised we had to compensate for multiple filters not calling ap_pass_brigade(), instead of just one. The new patch takes out empty completely and instead tracks c->data_in_output_filters, which could change during the loop.

Regards, Graham
—
Re: [Patch] Async write completion for the full connection filter stack
Wouldn't it make more sense, instead of using an empty brigade, to create yet another metabucket that signals write completion? It could also contain information about how much data to send down the chain for single filters if they e.g. send heap or transient buckets. Otherwise how should they know? If you have a filter that has a large file bucket set aside, and it transforms it e.g. to a heap bucket during its processing because it changes data in it, I guess it doesn't make sense for it to send all its data at once when it gets triggered for write completion, as we would end up in a blocking write in the core filter. But if it knows how much is left in the core filter buffer, it could try to send just that and thus avoid blocking writes. And if there is no room left in the buffer, or if what is left is too small for the filter to operate on, the filter could just pass the bucket down the chain; if it ends up in the core output filter, the core output filter would just try to write what it has buffered.

Regards Rüdiger

Jim Jagielski wrote: Gotcha... +1 On Sep 8, 2014, at 11:29 AM, Graham Leggett minf...@sharp.fm wrote: On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote: This is pretty cool... haven't played too much with it, but via inspection I like the implementation.
Re: [Patch] Async write completion for the full connection filter stack
Another consideration: We now have the idea of a master and slave connection, and maybe something there would also help... FWIW: I like using an empty bucket conceptually since it should be easy and quick to check.

On Sep 8, 2014, at 2:53 PM, Ruediger Pluem rpl...@apache.org wrote:

Wouldn't it make more sense, instead of using an empty brigade, to create yet another metabucket that signals write completion? It could also contain information about how much data to send down the chain for single filters if they e.g. send heap or transient buckets. Otherwise how should they know? If you have a filter that has a large file bucket set aside, and it transforms it e.g. to a heap bucket during its processing because it changes data in it, I guess it doesn't make sense for it to send all its data at once when it gets triggered for write completion, as we would end up in a blocking write in the core filter. But if it knows how much is left in the core filter buffer, it could try to send just that and thus avoid blocking writes. And if there is no room left in the buffer, or if what is left is too small for the filter to operate on, the filter could just pass the bucket down the chain; if it ends up in the core output filter, the core output filter would just try to write what it has buffered.

Regards Rüdiger

Jim Jagielski wrote: Gotcha... +1 On Sep 8, 2014, at 11:29 AM, Graham Leggett minf...@sharp.fm wrote: On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote: This is pretty cool... haven't played too much with it, but via inspection I like the implementation.
Re: [Patch] Async write completion for the full connection filter stack
On Mon, 2014-09-08 at 03:31 +0200, Graham Leggett wrote:

Hmm, maybe I'm just being thick 'cos I should be asleep at this hour, but ...

+/* No problems found, and were we sent an empty brigade, and
+ * did this empty brigade not get passed on by a filter to the next
+ * filter in the chain? Compensate by passing the empty brigade to
+ * the next filter, so every filter gets a turn to write. This
+ * occurs during write completion.
+ */

You mean this code can only ever be entered during write completion? Else how does this play with a filter that legitimately/necessarily buffers everything - e.g. XSLT? Or a brigade containing a single metadata bucket and no data, introduced by one filter as a signal to another?

-- Nick Kew