Re: [Patch] non blocking writes in core

Graham Leggett Thu, 21 Nov 2013 10:38:08 -0800

On 21 Nov 2013, at 10:43 AM, Joe Orton <[email protected]> wrote:

> Those rules are written (explicitly) for resource-level filters.  They 
> would have to be a little different for CONNECTION level, e.g. EOS 
> handling should probably be different... though I'm not sure how we'd 
> write the rule.  Probably connection filters should delete or pass EOS, 
> since they don't (shouldn't) care about request boundaries.  Dunno.
> 
> mod_ssl's output filter goes to some effort to apply (10) consistently, 
> and is agnostic to data bucket types (5), but yeah, the metadata 
> handling is looks wrong as you say.


Having looked at this is a little deeper, having implemented the initial pass 
at nonblocking support for the core, and then trying to implement how mod_ssl 
might use this, it has uncovered another snag - filters right now assume that 
APR_EAGAIN is a fatal error and leave immediately, when in reality we might be 
half way through transforming and writing a brigade.

Right now, it seems that filters have the property that they are obliged to 
consume the brigade they have been given. What this means practically is that 
if mod_ssl is given a 4GB file bucket, it is obliged to consume the entire 4GB 
file bucket and ensure it is written downstream before returning, and that 
breaks write completion. Ideally, the core filter should be able to return 
APR_EAGAIN and mod_ssl should "do the right thing" to set aside the remainder 
before passing back the "error" to the caller, but it doesn't do that, and 
neither does mod_deflate and a host of other external filters.

Ideally we need a new type of output filter, an asynchronous output filter, 
which in addition to the ten rules have these rules:

- Empty brigades must be passed upstream and not ignored like now. This is so 
that an upstream filter with setaside data has the chance to keep writing it, 
without messing about with metadata buckets or other performance sapping stuff.

- Async filters must understand and react to APR_EAGAIN from upstream, 
APR_EAGAIN is not an error, instead it is a request to call us again when we 
are writable, possibly with an empty brigade. We must setaside what we have so 
far unwritten (probably using functionality buried today in the core output 
filter, suitably reference counting the c->data_in_output_filters).

- The core output filter becomes non blocking by default, if we want blocking 
behaviour we must explicitly send a flush bucket. While it would be nice to 
pass a "block/nonblock" flag like we do with the input filter stack this 
competes with the flush buckets, which effectively mean "block please". 
Defaulting to "nonblock until explicitly told otherwise" is nice and simple.

Given that the world must be given an opportunity to migrate to async filters, 
we would need to support both at the same time.

What I am thinking is that a filter registers itself either as a legacy 
synchronous filter (you'll be given a brigade, don't return until all the 
brigade is consumed), or an asynchronous filter (feel free to return APR_EAGAIN 
at any time, and setaside data if you receive APR_EAGAIN from upstream), you 
state explicitly which.

The key bit will be the transition from an upstream async filter to an earlier 
sync filter, APR_EAGAIN will need to be trapped and responded to by sending a 
flush bucket up the stack, effectively allowing us to convert APR_EAGAIN into 
APR_SUCCESS which can then be safely received by the earlier sync filter. This 
is a simple task that could be accomplished inside ap_pass_brigade().

The event MPM would simply need to pass empty brigades to c->output_filters 
until c->data_in_output_filters returns to zero.

The addition of a sync filter to the c->output_filters would cause 
write_completion to stop working, but it will still be functional until the 
sync filter became async.

We currently have this situation where we keep track of the start of the 
protocol filter stack in c->output_filters instead of the real start of the 
stack, we could for now insert a helper protocol filter that runs first and 
waits until EOS is received meaning "synchronous processing is done", before 
setting aside the data and returning APR_SUCCESS, inviting event to enter write 
completion in the hope that everything from this filter onwards is async and 
therefore eligible for write completion. This filter can also be responsible 
for flow control, vastly simplifying the core_output_filter we have now.

All of this is still backportable to v2.4.

Regards,
Graham
--

Re: [Patch] non blocking writes in core

Reply via email to