RE: [Patch] Async write completion for the full connection filter stack

2014-09-09 Thread Plüm, Rüdiger, Vodafone Group


 -----Original Message-----
 From: Jim Jagielski [mailto:j...@jagunet.com]
 Sent: Monday, 8 September 2014 21:31
 To: dev@httpd.apache.org
 Subject: Re: [Patch] Async write completion for the full connection filter
 stack
 
 Another consideration: We now have the idea of a master
 and slave connection, and maybe something there would
 also help...
 
 FWIW: I like using an empty bucket conceptually since it should
 be ez and quick to check.

Agreed, but I think from a design perspective the meaning we assign to an
empty brigade is a side effect that does not immediately jump out at you,
especially if you are just developing modules.
Taking the below further, we might need some kind of advisor API for filters
that tells them how much data they should consume, to avoid buffering too
much, and how much they can send down the chain without ending up in a
blocking write.
How much buffering is advised could be set by a configuration directive.
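
As a rough illustration, such an advisor API might look something like this
(a hypothetical sketch only - neither these names nor the struct exist):

    /* Hypothetical advisor API - all names are illustrative. */
    typedef struct ap_filter_advice_t {
        apr_off_t consume_max;  /* read no more than this from upstream,
                                 * to bound what we buffer */
        apr_off_t send_max;     /* amount we can pass down the chain
                                 * without risking a blocking write */
    } ap_filter_advice_t;

    /* Fill in advice for filter f, derived from what the core output
     * filter has buffered and a configured buffering limit. */
    apr_status_t ap_filter_get_advice(ap_filter_t *f,
                                      ap_filter_advice_t *advice);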

Regards

Rüdiger

 On Sep 8, 2014, at 2:53 PM, Ruediger Pluem rpl...@apache.org wrote:
 
  Wouldn't it make more sense, instead of using an empty brigade, to
  create yet another metabucket that signals write completion? It could
  also contain information on how much data to send down the chain for
  individual filters if they e.g. send heap or transient buckets.
  Otherwise how should they know?
  If you have a filter that has a large file bucket set aside and
  transforms it e.g. into a heap bucket during its processing because it
  changes the data, I guess it doesn't make sense for it to send
  everything at once when it gets triggered for write completion, as we
  would then end up in a blocking write in the core filter. But if it
  knows how much room is left in the core filter buffer, it could try to
  send just that and thus avoid blocking writes. And if there is no room
  left in the buffer, or if what is left is too small for the filter to
  operate on, the filter could just pass the bucket down the chain, and
  if it ended up in the core output filter, the core output filter would
  just try to write what it has buffered.
 
 
  Regards
 
  Rüdiger
 
  Jim Jagielski wrote:
  Gotcha... +1
  On Sep 8, 2014, at 11:29 AM, Graham Leggett minf...@sharp.fm wrote:
 
  On 08 Sep 2014, at 3:50 PM, Jim Jagielski j...@jagunet.com wrote:
 
  This is pretty cool... haven't played too much with it, but
  via inspection I like the implementation.
 
 



Re: [Patch] Async write completion for the full connection filter stack

2014-09-09 Thread Nick Kew
On Mon, 2014-09-08 at 17:25 +0200, Graham Leggett wrote:

 Ideally, filters should do this, but generally they don’t:
 
 /* Do nothing if asked to filter nothing. */
 if (APR_BRIGADE_EMPTY(bb)) {
     return ap_pass_brigade(f->next, bb);
 }

Why on Earth should filters want to do that, as opposed to:

 Some filters, like mod_deflate, do this:
 
 /* Do nothing if asked to filter nothing. */
 if (APR_BRIGADE_EMPTY(bb)) {
     return APR_SUCCESS;
 }

or similar variants?

 In these cases ap_pass_brigade() is never called, so we detect this by 
 keeping a marker that is changed on every call to ap_pass_brigade(). If the 
 marker wasn’t changed during the call to the filter, we compensate by calling 
 each downstream filter until the marker is changed, or we run out of filters.

Yes.  The logic is that we call ap_pass_brigade when there's
something to pass.  Not when there's nothing: that would just
look like superfluous overhead.

If you have a reason to propagate an immediate event regardless
of that logic, surely that's the business of a FLUSH bucket.
Then the question becomes, is it ever right to absorb (or buffer)
and fail to propagate a FLUSH?  You seem instead to be ascribing
FLUSH semantics to an empty brigade!

As a filter developer, it's my business to pass a brigade when:
 1) I'm ready to pass data.
 2) I encounter EOS, when I must finish up and propagate it.
 3) I am explicitly signalled to FLUSH whatever I can.
What am I missing?  Do we have a need to refine the FLUSH
bucket type?  Maybe an EVENT bucket carrying an event descriptor?
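
If we did, defining a new metabucket is cheap. A minimal sketch of what an
EVENT type could look like, using the standard apr_bucket_type_t hooks
(hypothetical - no such type exists today):

    /* Hypothetical EVENT metabucket: carries no data; an event
     * descriptor could ride in b->data. All names illustrative. */
    static apr_status_t event_bucket_read(apr_bucket *b, const char **str,
                                          apr_size_t *len,
                                          apr_read_type_e block)
    {
        *str = NULL;    /* metadata buckets yield no data */
        *len = 0;
        return APR_SUCCESS;
    }

    static const apr_bucket_type_t ap_bucket_type_event = {
        "EVENT", 5, APR_BUCKET_METADATA,
        apr_bucket_destroy_noop,     /* descriptor freed with its pool */
        event_bucket_read,
        apr_bucket_setaside_noop,
        apr_bucket_split_notimpl,
        apr_bucket_simple_copy
    };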

-- 
Nick Kew



Re: [Patch] Async write completion for the full connection filter stack

2014-09-09 Thread Graham Leggett

On 09 Sep 2014, at 10:58 AM, Nick Kew n...@webthing.com wrote:

 Ideally, filters should do this, but generally they don’t:
 
    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return ap_pass_brigade(f->next, bb);
    }
 
 Why on Earth should filters want to do that, as opposed to:
 
 Some filters, like mod_deflate, do this:
 
    /* Do nothing if asked to filter nothing. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }
 
 or similar variants?

Because if they did, the compensation code in ap_pass_brigade() wouldn’t be 
necessary.
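
The compensation amounts to roughly this (a simplified sketch of the idea,
not the actual patch; the marker is assumed to be bumped inside every call
to ap_pass_brigade()):

    static apr_status_t pass_with_compensation(ap_filter_t *f,
                                               apr_bucket_brigade *bb,
                                               unsigned int *marker)
    {
        unsigned int before = *marker;    /* snapshot before the call */
        apr_status_t rv = f->frec->filter_func.out_func(f, bb);
        ap_filter_t *next;

        if (rv == APR_SUCCESS && *marker == before) {
            /* the filter consumed everything without passing anything
             * on: wake each downstream filter with an empty brigade so
             * setaside data gets a chance to drain */
            for (next = f->next; next && *marker == before;
                 next = next->next) {
                apr_brigade_cleanup(bb);
                rv = next->frec->filter_func.out_func(next, bb);
            }
        }
        return rv;
    }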

 In these cases ap_pass_brigade() is never called, so we detect this by 
 keeping a marker that is changed on every call to ap_pass_brigade(). If the 
 marker wasn’t changed during the call to the filter, we compensate by 
 calling each downstream filter until the marker is changed, or we run out of 
 filters.
 
 Yes.  The logic is that we call ap_pass_brigade when there's
 something to pass.  Not when there's nothing: that would just
 look like superfluous overhead.
 
 If you have a reason to propagate an immediate event regardless
 of that logic, surely that's the business of a FLUSH bucket.
 Then the question becomes, is it ever right to absorb (or buffer)
 and fail to propagate a FLUSH?  You seem instead to be ascribing
 FLUSH semantics to an empty brigade!

To be clear, an empty brigade does _not_ mean flush, not even slightly.

Flush means “stop everything and perform this potentially expensive task to 
completion right now”, and is the exact opposite of what we’re trying to 
achieve.

 As a filter developer, it's my business to pass a brigade when:
 1) I'm ready to pass data.
 2) I encounter EOS, when I must finish up and propagate it.
 3) I am explicitly signalled to FLUSH whatever I can.
 What am I missing?  Do we have a need to refine the FLUSH
 bucket type?  Maybe an EVENT bucket carrying an event descriptor?

In a synchronous world where it doesn’t matter how long a unit of work takes, 
sure.

In an async world, where you need to break up long running tasks into short
running ones so that others get a chance to have their data sent by the same
thread, this doesn’t work. Filters need to be able to yield and set aside data
when they’re given too much data to process, just like the core filter can -
but right now they can’t, because such a filter will never be called again:
upstream has no data to send.
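
The raw tool for setting data aside already exists - ap_save_brigade() - what
is missing is the guaranteed later wakeup, plus a yield pattern along these
lines (a rough fragment; the out brigade, ctx->pending and the threshold are
illustrative):

    /* Transform at most YIELD_BYTES per invocation; set the remainder
     * aside for the next wakeup. */
    #define YIELD_BYTES 65536

    apr_off_t done = 0;
    apr_status_t rv = APR_SUCCESS;

    while (!APR_BRIGADE_EMPTY(bb) && done < YIELD_BYTES) {
        apr_bucket *b = APR_BRIGADE_FIRST(bb);
        const char *data;
        apr_size_t len;

        if (APR_BUCKET_IS_METADATA(b)) {
            APR_BUCKET_REMOVE(b);
            APR_BRIGADE_INSERT_TAIL(out, b);  /* pass metadata through */
            continue;
        }
        rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }
        /* ... transform data/len and append the result to out ... */
        done += len;
        apr_bucket_delete(b);
    }
    if (!APR_BRIGADE_EMPTY(bb)) {
        /* more than we can chew in one go: keep the rest for later */
        rv = ap_save_brigade(f, &ctx->pending, &bb, f->c->pool);
        if (rv != APR_SUCCESS) {
            return rv;
        }
    }
    return ap_pass_brigade(f->next, out);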

Regards,
Graham
—



Re: [Patch] Async write completion for the full connection filter stack

2014-09-09 Thread Graham Leggett
On 08 Sep 2014, at 8:53 PM, Ruediger Pluem rpl...@apache.org wrote:

 Wouldn't it make more sense, instead of using an empty brigade, to create
 yet another metabucket that signals write completion? It could also
 contain information on how much data to send down the chain for
 individual filters if they e.g. send heap or transient buckets. Otherwise
 how should they know?
 If you have a filter that has a large file bucket set aside and
 transforms it e.g. into a heap bucket during its processing because it
 changes the data, I guess it doesn't make sense for it to send everything
 at once when it gets triggered for write completion, as we would then end
 up in a blocking write in the core filter. But if it knows how much room
 is left in the core filter buffer, it could try to send just that and
 thus avoid blocking writes. And if there is no room left in the buffer,
 or if what is left is too small for the filter to operate on, the filter
 could just pass the bucket down the chain, and if it ended up in the core
 output filter, the core output filter would just try to write what it has
 buffered.

I spent a lot of time going down this path of having a dedicated metabucket,
and quickly got bogged down in complexity. The key problem was “what does a
filter actually do when it gets one?”; it was unclear, and it made my head
bleed. That makes life hard for module authors, and that is bad. As I recall
there were also broken filters out there that only knew about FLUSH and EOS
buckets (e.g. ap_http_chunk_filter()).

The problem we’re trying to solve is one of starvation - no filter can set
aside data for later (except core, via the NULL hack), because there is no
guarantee that it will ever be called again. You have to write it now, or
potentially write it never. The start of the solution is to ensure filters
aren’t starved: if you have data in the output filters - and obviously you
have no idea which filters have setaside data - you need a way to wake them
all up. The simplest and least disruptive way is to pass them all an empty
brigade, job done. We’ve got precedent for this - we’ve been sending NULL to
the core filter to achieve the same thing; now we want something that works
with any filter.
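
The wakeup itself is then trivial - roughly this (sketch, with c being the
conn_rec):

    /* Wake the whole output filter stack during write completion by
     * passing an empty brigade down the chain. */
    apr_bucket_brigade *bb = apr_brigade_create(c->pool, c->bucket_alloc);
    apr_status_t rv = ap_pass_brigade(c->output_filters, bb);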

The second part of the problem is filters biting off more than they can chew.
Example: give mod_ssl a 1GB file bucket, and mod_ssl won’t yield until that
entire 1GB file has been sent, for the reason (now solved) above. The next
step to enable write completion is to teach filters like mod_ssl to yield
when handling large quantities of data.

The core filter has an algorithm to yield, including various checks for flow 
control and sanity with respect to file handles. If a variant of this algorithm 
could be exposed generically and made available to critical filters like 
mod_ssl, we’ll crack write completion.
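
In shape, that generic API might be as small as this (purely hypothetical
signatures, sketched from the behaviour of the core filter):

    /* Set aside bb into the filter's own pending brigade, using the
     * same setaside rules (heap/transient/file handling) as core. */
    AP_DECLARE(apr_status_t) ap_filter_setaside_brigade(ap_filter_t *f,
                                              apr_bucket_brigade *bb);

    /* Move previously setaside data back into bb for another attempt,
     * honouring flow control limits. */
    AP_DECLARE(apr_status_t) ap_filter_reinstate_brigade(ap_filter_t *f,
                                              apr_bucket_brigade *bb);

    /* Non-zero if the filter has buffered enough that it should stop
     * producing and wait for write completion. */
    AP_DECLARE(int) ap_filter_should_yield(ap_filter_t *f);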

Regards,
Graham
—



RE: [Patch] Async write completion for the full connection filter stack

2014-09-09 Thread Plüm, Rüdiger, Vodafone Group


 -----Original Message-----
 From: Graham Leggett [mailto:minf...@sharp.fm]
 Sent: Tuesday, 9 September 2014 17:45
 To: dev@httpd.apache.org
 Subject: Re: [Patch] Async write completion for the full connection
 filter stack
 
 On 08 Sep 2014, at 8:53 PM, Ruediger Pluem rpl...@apache.org wrote:
 
  Wouldn't it make more sense, instead of using an empty brigade, to
  create yet another metabucket that signals write completion? It could
  also contain information on how much data to send down the chain for
  individual filters if they e.g. send heap or transient buckets.
  Otherwise how should they know?
  If you have a filter that has a large file bucket set aside and
  transforms it e.g. into a heap bucket during its processing because it
  changes the data, I guess it doesn't make sense for it to send
  everything at once when it gets triggered for write completion, as we
  would then end up in a blocking write in the core filter. But if it
  knows how much room is left in the core filter buffer, it could try to
  send just that and thus avoid blocking writes. And if there is no room
  left in the buffer, or if what is left is too small for the filter to
  operate on, the filter could just pass the bucket down the chain, and
  if it ended up in the core output filter, the core output filter would
  just try to write what it has buffered.
 
 I spent a lot of time going down this path of having a dedicated
 metabucket, and quickly got bogged down in complexity. The key problem
 was "what does a filter actually do when it gets one?"; it was unclear,

Don't we have the same problem with an empty brigade? Some filters are not
going to handle it as we expect - hence the additional logic in
ap_pass_brigade(). I guess the minimum behavior we need from every filter is
to ignore it and pass it on.
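
In other words, roughly (sketch):

    /* Minimum well-behaved handling of metadata a filter does not
     * understand: leave it in place so it travels down the chain. */
    apr_bucket *b;

    for (b = APR_BRIGADE_FIRST(bb);
         b != APR_BRIGADE_SENTINEL(bb);
         b = APR_BUCKET_NEXT(b)) {
        if (APR_BUCKET_IS_METADATA(b)) {
            continue;   /* do not consume it; just pass it along */
        }
        /* ... process data buckets ... */
    }
    return ap_pass_brigade(f->next, bb);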

 and it made my head bleed. That makes life hard for module authors,
 and that is bad. As I recall there were also broken filters out there
 that only knew about FLUSH and EOS buckets (e.g. ap_http_chunk_filter()).

We already have additional metabuckets like error buckets or EOR, so I don't
see an issue with creating a new one. Any filter that does not pass on a
metabucket it does not understand, or at least try to process it, is simply
broken.

 
 The problem we're trying to solve is one of starvation - no filters can
 set aside data for later (except core via the NULL hack), because there
 is no guarantee that they'll ever be called again later. You have to
 write it now, or potentially write it never. The start of the solution
 is to ensure filters aren't starved: if you have data in the output filters
 - and obviously you have no idea which filters have setaside data - you
 need a way to wake them all up. The simplest and least disruptive way is
 to pass them all an empty brigade, job done. We've got precedent for
 this - we've been sending NULL to the core filter to achieve the same

But this is *our* filter and it will not hit any custom filters, so we can
play this kind of hacky game here.

 thing; now we want something that works with any filter.

Yes, and this is the reason why I still believe a meta bucket is better.

 
 The second part of the problem is filters biting off more than they can
 chew. Example: give mod_ssl a 1GB file bucket and mod_ssl won't yield
 until that entire 1GB file has been sent for the reason (now solved)
 above. The next step to enable write completion is to teach filters like
 mod_ssl to yield when handling large quantities of data.
 
 The core filter has an algorithm to yield, including various checks for
 flow control and sanity with respect to file handles. If a variant of
 this algorithm could be exposed generically and made available to
 critical filters like mod_ssl, we'll crack write completion.

See my other post. I proposed some kind of advisor API that tells a filter
how much it should write to avoid buffering too much and consuming too much
memory, and how much it could write to likely avoid a blocking write.
As this will not always be accurate, I call it an advisor API.
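
To illustrate, a filter consulting such an API might do something like this
(hypothetical, building on the equally hypothetical names from my earlier
post; bytes_ready and ctx->pending are illustrative):

    ap_filter_advice_t advice;
    apr_status_t rv;

    ap_filter_get_advice(f, &advice);

    if (bytes_ready <= advice.send_max) {
        /* small enough to pass down without risking a blocking write */
        rv = ap_pass_brigade(f->next, out);
    }
    else {
        /* send only what the advice allows; set the rest aside */
        apr_bucket *after = NULL;
        apr_bucket_brigade *later;

        apr_brigade_partition(out, advice.send_max, &after);
        later = apr_brigade_split_ex(out, after, NULL);
        rv = ap_pass_brigade(f->next, out);
        if (rv == APR_SUCCESS) {
            rv = ap_save_brigade(f, &ctx->pending, &later, f->c->pool);
        }
    }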

Regards

Rüdiger