RE: [Patch] non blocking writes in core

Plüm , Rüdiger , Vodafone Group Thu, 21 Nov 2013 13:04:32 -0800


> -----Original Message-----
> From: Graham Leggett [mailto:minf...@sharp.fm]
> Sent: Donnerstag, 21. November 2013 14:58
> To: dev@httpd.apache.org
> Subject: Re: [Patch] non blocking writes in core
> 
> On 21 Nov 2013, at 10:43 AM, Joe Orton <jor...@redhat.com> wrote:
> 
> > Those rules are written (explicitly) for resource-level filters.  They
> > would have to be a little different for CONNECTION level, e.g. EOS
> > handling should probably be different... though I'm not sure how we'd
> > write the rule.  Probably connection filters should delete or pass EOS,
> > since they don't (shouldn't) care about request boundaries.  Dunno.
> >
> > mod_ssl's output filter goes to some effort to apply (10) consistently,
> > and is agnostic to data bucket types (5), but yeah, the metadata
> > handling is looks wrong as you say.
> 
> Having looked at this is a little deeper, having implemented the initial
> pass at nonblocking support for the core, and then trying to implement how
> mod_ssl might use this, it has uncovered another snag - filters right now
> assume that APR_EAGAIN is a fatal error and leave immediately, when in
> reality we might be half way through transforming and writing a brigade.
> 
> Right now, it seems that filters have the property that they are obliged
> to consume the brigade they have been given. What this means practically
> is that if mod_ssl is given a 4GB file bucket, it is obliged to consume
> the entire 4GB file bucket and ensure it is written downstream before
> returning, and that breaks write completion. Ideally, the core filter
> should be able to return APR_EAGAIN and mod_ssl should "do the right
> thing" to set aside the remainder before passing back the "error" to the
> caller, but it doesn't do that, and neither does mod_deflate and a host of
> other external filters.
> 
> Ideally we need a new type of output filter, an asynchronous output
> filter, which in addition to the ten rules have these rules:
> 
> - Empty brigades must be passed upstream and not ignored like now. This is
> so that an upstream filter with setaside data has the chance to keep
> writing it, without messing about with metadata buckets or other
> performance sapping stuff.
> 
> - Async filters must understand and react to APR_EAGAIN from upstream,
> APR_EAGAIN is not an error, instead it is a request to call us again when
> we are writable, possibly with an empty brigade. We must setaside what we
> have so far unwritten (probably using functionality buried today in the
> core output filter, suitably reference counting the c-
> >data_in_output_filters).
> 
> - The core output filter becomes non blocking by default, if we want
> blocking behaviour we must explicitly send a flush bucket. While it would
> be nice to pass a "block/nonblock" flag like we do with the input filter
> stack this competes with the flush buckets, which effectively mean "block
> please". Defaulting to "nonblock until explicitly told otherwise" is nice
> and simple.
> 
> Given that the world must be given an opportunity to migrate to async
> filters, we would need to support both at the same time.
> 
> What I am thinking is that a filter registers itself either as a legacy
> synchronous filter (you'll be given a brigade, don't return until all the
> brigade is consumed), or an asynchronous filter (feel free to return
> APR_EAGAIN at any time, and setaside data if you receive APR_EAGAIN from
> upstream), you state explicitly which.
> 
> The key bit will be the transition from an upstream async filter to an
> earlier sync filter, APR_EAGAIN will need to be trapped and responded to
> by sending a flush bucket up the stack, effectively allowing us to convert
> APR_EAGAIN into APR_SUCCESS which can then be safely received by the
> earlier sync filter. This is a simple task that could be accomplished
> inside ap_pass_brigade().
> 
> The event MPM would simply need to pass empty brigades to c-
> >output_filters until c->data_in_output_filters returns to zero.
> 
> The addition of a sync filter to the c->output_filters would cause
> write_completion to stop working, but it will still be functional until
> the sync filter became async.
> 
> We currently have this situation where we keep track of the start of the
> protocol filter stack in c->output_filters instead of the real start of
> the stack, we could for now insert a helper protocol filter that runs
> first and waits until EOS is received meaning "synchronous processing is


If this filter sets aside all stuff until eos (or a flush bucket) is seen it 
might
cause a huge amount of memory consumption. So we possibly need some limit for 
memory consuming
buckets which when passed causes us to send the stuff we have down the chain 
even without an eos
or flush bucket. As far as I know the current core output filter also only 
buffers a limited
amount of "memory" buckets until it decides to go for blocking write.

Otherwise sounds sensible.

Regards

Rüdiger

RE: [Patch] non blocking writes in core

Reply via email to