Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Ruediger Pluem



On 4/1/20 11:02 AM, Joe Orton wrote:
> On Wed, Apr 01, 2020 at 10:06:20AM +0200, Ruediger Pluem wrote:
>>
>>
>> On 4/1/20 9:53 AM, Joe Orton wrote:
>>> On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
>>>> I have checked socket, pipe and cgi buckets and none of them seem to
>>>> return EOF. In case they hit EOF they replace themselves with
>>>> an immortal bucket of length 0. So the above seems to be just an additional
>>>> safety net in case a (future) morphing bucket behaves differently and
>>>> returns EOF, but with the current code that path should not really be hit.
>>>
>>> Yeah, the "read till EOF" is an implementation detail for those bucket 
>>> types, I would very strongly argue if they ever exposed EOF on read() it 
>>> would be a bucket type bug.  It could quite possibly obscure a bug 
>>> elsewhere if filters ignored EOF on read() so I don't think that should 
>>> be recommended.
>>
>> So you recommend that part of the patch to be reverted?
> 
> Thinking about it more, the most likely way to hit that condition is a 
> FILE bucket where the underlying file is truncated simultaneous to it 
> being read (i.e. it's shorter than when the bucket was created).  That 
> will definitely result in apr_bucket_read() returning EOF, and I think 
> should definitely be treated as an error rather than silently ignored.
> 
> TL;DR -> yes, or am I missing something?
> 

I think you are correct. This should be an error as it can only happen in error 
cases.

Regards

Rüdiger


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Joe Orton
On Wed, Apr 01, 2020 at 10:06:20AM +0200, Ruediger Pluem wrote:
> 
> 
> On 4/1/20 9:53 AM, Joe Orton wrote:
> > On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
> >> I have checked socket, pipe and cgi buckets and none of them seem to
> >> return EOF. In case they hit EOF they replace themselves with
> >> an immortal bucket of length 0. So the above seems to be just an additional
> >> safety net in case a (future) morphing bucket behaves differently and
> >> returns EOF, but with the current code that path should not really be hit.
> > 
> > Yeah, the "read till EOF" is an implementation detail for those bucket 
> > types, I would very strongly argue if they ever exposed EOF on read() it 
> > would be a bucket type bug.  It could quite possibly obscure a bug 
> > elsewhere if filters ignored EOF on read() so I don't think that should 
> > be recommended.
> 
> So you recommend that part of the patch to be reverted?

Thinking about it more, the most likely way to hit that condition is a 
FILE bucket where the underlying file is truncated simultaneous to it 
being read (i.e. it's shorter than when the bucket was created).  That 
will definitely result in apr_bucket_read() returning EOF, and I think 
should definitely be treated as an error rather than silently ignored.

TL;DR -> yes, or am I missing something?



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Ruediger Pluem



On 4/1/20 9:53 AM, Joe Orton wrote:
> On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
>> I have checked socket, pipe and cgi buckets and none of them seem to return
>> EOF. In case they hit EOF they replace themselves with
>> an immortal bucket of length 0. So the above seems to be just an additional
>> safety net in case a (future) morphing bucket behaves differently and
>> returns EOF, but with the current code that path should not really be hit.
> 
> Yeah, the "read till EOF" is an implementation detail for those bucket 
> types, I would very strongly argue if they ever exposed EOF on read() it 
> would be a bucket type bug.  It could quite possibly obscure a bug 
> elsewhere if filters ignored EOF on read() so I don't think that should 
> be recommended.

So you recommend that part of the patch to be reverted?

Regards

Rüdiger


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Joe Orton
On Wed, Apr 01, 2020 at 09:24:27AM +0200, Ruediger Pluem wrote:
> I have checked socket, pipe and cgi buckets and none of them seem to return
> EOF. In case they hit EOF they replace themselves with
> an immortal bucket of length 0. So the above seems to be just an additional
> safety net in case a (future) morphing bucket behaves differently and
> returns EOF, but with the current code that path should not really be hit.

Yeah, the "read till EOF" is an implementation detail for those bucket 
types, I would very strongly argue if they ever exposed EOF on read() it 
would be a bucket type bug.  It could quite possibly obscure a bug 
elsewhere if filters ignored EOF on read() so I don't think that should 
be recommended.



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Ruediger Pluem



On 3/31/20 6:52 PM, Yann Ylavic wrote:
> On Tue, Mar 31, 2020 at 7:35 AM Ruediger Pluem  wrote:
>>
>> On 3/31/20 1:19 AM, Yann Ylavic wrote:

>>
>> An idea would be to have a filter that comes after all request filters
>> (either as the very first connection filter or in the same
>> position as the ap_request_core_filter) and, as soon as it sees an EOS
>> bucket, it adds an EOR bucket afterwards.
>> For sync MPMs we would need an option to make ap_run_output_pending flush
>> all data at least at the request filter level.
> 
> This would potentially segfault a lot of output filters / handlers
> IMHO. What if we add the EOR just after EOS and pass the whole lot to
> the core output filter, which then fails to write to the socket
> (ECONNRESET)? The EOR gets destroyed, we return the error to the
> request filters/handlers, which then shouldn't touch anything allocated
> from r->pool. This includes the bb they used to pass their buckets, for
> instance, so the usual ap_pass_brigade(r->output_filters, bb) +
> apr_brigade_cleanup(bb) would simply crash if bb =
> apr_brigade_create(r->pool, c->bucket_alloc) initially.

Good point. I missed the unwinding of the call stack after EOR was inserted.

Regards

Rüdiger



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-04-01 Thread Ruediger Pluem



On 3/31/20 6:31 PM, Yann Ylavic wrote:
> On Tue, Mar 31, 2020 at 7:11 AM Ruediger Pluem  wrote:
>>
>> On 3/31/20 1:19 AM, Yann Ylavic wrote:
>>> Index: server/core_filters.c
>>> ===
>>> --- server/core_filters.c (revision 1875881)
>>> +++ server/core_filters.c (working copy)
>>> @@ -543,6 +543,12 @@ static apr_status_t send_brigade_nonblocking(apr_s
>>>
>>>  rv = apr_bucket_read(bucket, &data, &length, APR_BLOCK_READ);
>>>  }
>>> +if (APR_STATUS_IS_EOF(rv)) {
>>> +/* Morphing bucket exhausted, next. */
>>> +apr_bucket_delete(bucket);
>>> +rv = APR_SUCCESS;
>>> +continue;
>>> +}
>>>  if (rv != APR_SUCCESS) {
>>>  goto cleanup;
>>>  }
>>
>>
>> How is the above related to the issue here? I guess this is something
>> probably all callers to apr_bucket_read need to take care
>> of, correct?
> 
> Since the core output filter can now have to handle morphing buckets
> (not retained by ap_request_core_filter() anymore), I thought it was
> related..
> Agreed that all apr_bucket_read() users should do that.

I have checked socket, pipe and cgi buckets and none of them seem to return
EOF. In case they hit EOF they replace themselves with
an immortal bucket of length 0. So the above seems to be just an additional
safety net in case a (future) morphing bucket behaves differently and
returns EOF, but with the current code that path should not really be hit.
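[For reference, a minimal sketch of the consumption pattern being discussed, assuming only the public APR bucket API; the function name and error policy are illustrative, not httpd code.]

#include "apr_buckets.h"

/* Read every data bucket in a brigade. Morphing buckets (length == -1),
 * e.g. socket/pipe/CGI buckets, morph in place on apr_bucket_read(): the
 * bucket becomes a heap bucket holding the data read and a new bucket for
 * the remainder is inserted after it, so iterating with APR_BUCKET_NEXT()
 * stays valid. At EOF they turn into a zero-length immortal bucket rather
 * than returning APR_EOF, so APR_EOF here is treated as an error.
 */
static apr_status_t consume_brigade(apr_bucket_brigade *bb)
{
    apr_bucket *b;

    for (b = APR_BRIGADE_FIRST(bb);
         b != APR_BRIGADE_SENTINEL(bb);
         b = APR_BUCKET_NEXT(b)) {
        const char *data;
        apr_size_t len;
        apr_status_t rv;

        if (APR_BUCKET_IS_METADATA(b)) {
            continue; /* EOS, FLUSH, EOR, ... carry no data */
        }
        rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            /* Includes APR_EOF, e.g. a FILE bucket whose underlying file
             * was truncated after the bucket was created. */
            return rv;
        }
        /* ... use data/len ... */
    }
    return APR_SUCCESS;
}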

Regards

Rüdiger


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-31 Thread Yann Ylavic
On Tue, Mar 31, 2020 at 7:35 AM Ruediger Pluem  wrote:
>
> On 3/31/20 1:19 AM, Yann Ylavic wrote:
> >
> > Yes that's the issue from the start, request filters are not supposed
> > to retain anything after the EOS bucket, they are simply not called
> > anymore once they return.
>
> Although this is not needed to fix the issue that caused this thread, it is
> IMHO a pity. This means we fix the case of a large file
> / morphing bucket causing blocking writes over an SSL connection, but the
> same thing through a gzip / mod_substitute /
> mod_proxy_html filter still causes the blocking writes. But probably one step
> at a time.

Yes, if we don't want to block we need the handler to handle
should_yield or EAGAIN or whatever, SUSPEND (or register an MPM
callback) and be prepared to be called again. I don't see how we could
do it automagically in the core for the handler.

> >
> >> We would need to ensure that EOR is sent after EOS. But I guess we cannot 
> >> send EOR through the remaining request filters as most
> >> of them ignore and swallow what they see after EOS.
> >
> > No, too dangerous IMHO. If we pass the EOR to request filters they
> > need to be really cautious about what to do with it, and even more after
> > they pass it (don't control it anymore). Anything they have allocated
> > on r->pool (temporary brigade or whatever) can't be used anymore once
> > the EOR is destroyed.
> >
> > The EOR has to be sent to the connection filters, when we know that
> > the request is finished, and before we start sending anything for the
> > next request. That's ap_process_request_after_handler() currently and
> > it looks like the right place to me.
>
> An idea would be to have a filter that comes after all request filters
> (either as the very first connection filter or in the same
> position as the ap_request_core_filter) and, as soon as it sees an EOS
> bucket, it adds an EOR bucket afterwards.
> For sync MPMs we would need an option to make ap_run_output_pending flush
> all data at least at the request filter level.

This would potentially segfault a lot of output filters / handlers
IMHO. What if we add the EOR just after EOS and pass the whole lot to
the core output filter, which then fails to write to the socket
(ECONNRESET)? The EOR gets destroyed, we return the error to the
request filters/handlers, which then shouldn't touch anything allocated
from r->pool. This includes the bb they used to pass their buckets, for
instance, so the usual ap_pass_brigade(r->output_filters, bb) +
apr_brigade_cleanup(bb) would simply crash if bb =
apr_brigade_create(r->pool, c->bucket_alloc) initially.
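[For illustration, a hypothetical handler sketch (not httpd code) of the usual pattern referred to above, showing why bb, allocated from r->pool, cannot be touched once an EOR has destroyed the request.]

#include "httpd.h"
#include "http_protocol.h"
#include "util_filter.h"

static int example_handler(request_rec *r)
{
    conn_rec *c = r->connection;
    apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
    apr_status_t rv;

    APR_BRIGADE_INSERT_TAIL(bb,
        apr_bucket_immortal_create("hello\n", 6, c->bucket_alloc));
    APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create(c->bucket_alloc));

    /* If a filter appended an EOR right after the EOS and the core output
     * filter then failed to write (e.g. ECONNRESET), the EOR would already
     * have destroyed r->pool by the time ap_pass_brigade() returns ... */
    rv = ap_pass_brigade(r->output_filters, bb);

    /* ... so this usual cleanup would touch freed memory and crash,
     * since bb itself lives in r->pool. */
    apr_brigade_cleanup(bb);

    return (rv == APR_SUCCESS) ? OK : HTTP_INTERNAL_SERVER_ERROR;
}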

>
> >
> >
> > FWIW, attached is v2 with simpler "batching" in
> > ap_filter_setaside_brigade(), no spurious hunk, bb cleaned up at the
> > end, and APR_ENOTIMPL when called from a filter < AP_FTYPE_CONNECTION.
>
> Looks good to me. I understood that you verified that it fixes the issue and 
> I see no obvious downsides right now.
> Let's go with it and improve (feature wise) from there.

Thanks, r1875947.


Regards,
Yann.


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-31 Thread Yann Ylavic
On Tue, Mar 31, 2020 at 7:11 AM Ruediger Pluem  wrote:
>
> On 3/31/20 1:19 AM, Yann Ylavic wrote:
> > Index: server/core_filters.c
> > ===
> > --- server/core_filters.c (revision 1875881)
> > +++ server/core_filters.c (working copy)
> > @@ -543,6 +543,12 @@ static apr_status_t send_brigade_nonblocking(apr_s
> >
> >  rv = apr_bucket_read(bucket, &data, &length, APR_BLOCK_READ);
> >  }
> > +if (APR_STATUS_IS_EOF(rv)) {
> > +/* Morphing bucket exhausted, next. */
> > +apr_bucket_delete(bucket);
> > +rv = APR_SUCCESS;
> > +continue;
> > +}
> >  if (rv != APR_SUCCESS) {
> >  goto cleanup;
> >  }
>
>
> How is the above related to the issue here? I guess this is something 
> probably all callers to apr_bucket_read need to take care
> of, correct?

Since the core output filter can now have to handle morphing buckets
(not retained by ap_request_core_filter() anymore), I thought it was
related..
Agreed that all apr_bucket_read() users should do that.

Regards,
Yann.


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-31 Thread Joe Orton
On Tue, Mar 31, 2020 at 01:19:08AM +0200, Yann Ylavic wrote:
> FWIW, attached is v2 with simpler "batching" in
> ap_filter_setaside_brigade(), no spurious hunk, bb cleaned up at the
> end, and APR_ENOTIMPL when called from a filter < AP_FTYPE_CONNECTION.

FWIW this makes more sense to me than what's in trunk, so +1 here too.




Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-30 Thread Ruediger Pluem



On 3/31/20 1:19 AM, Yann Ylavic wrote:
> On Mon, Mar 30, 2020 at 10:23 PM Ruediger Pluem  wrote:
>>
>> On 3/26/20 2:00 PM, Yann Ylavic wrote:
>>>

> 
> []
>>> +ap_pass_brigade(c->output_filters, bb);
>>
>> But if request filters still have setaside brigades and these would be 
>> reactivated by ap_run_output_pending(c); in the
>> WRITE_COMPLETION phase they would write their response content after the EOR 
>> bucket to the filter chain, which actually means that
>> the EOR bucket is in the middle of the response data.
> 
> Yes that's the issue from the start, request filters are not supposed
> to retain anything after the EOS bucket, they are simply not called
> anymore once they return.

Although this is not needed to fix the issue that caused this thread, it is
IMHO a pity. This means we fix the case of a large file
/ morphing bucket causing blocking writes over an SSL connection, but the same
thing through a gzip / mod_substitute /
mod_proxy_html filter still causes the blocking writes. But probably one step
at a time.

> 
>> We would need to ensure that EOR is sent after EOS. But I guess we cannot 
> send EOR through the remaining request filters as most
>> of them ignore and swallow what they see after EOS.
> 
> No, too dangerous IMHO. If we pass the EOR to request filters they
> need to be really cautious about what to do with it, and even more after
> they pass it (don't control it anymore). Anything they have allocated
> on r->pool (temporary brigade or whatever) can't be used anymore once
> the EOR is destroyed.
> 
> The EOR has to be sent to the connection filters, when we know that
> the request is finished, and before we start sending anything for the
> next request. That's ap_process_request_after_handler() currently and
> it looks like the right place to me.

An idea would be to have a filter that comes after all request filters
(either as the very first connection filter or in the same
position as the ap_request_core_filter) and, as soon as it sees an EOS bucket,
it adds an EOR bucket afterwards.
For sync MPMs we would need an option to make ap_run_output_pending flush all
data at least at the request filter level.

> 
> 
> FWIW, attached is v2 with simpler "batching" in
> ap_filter_setaside_brigade(), no spurious hunk, bb cleaned up at the
> end, and APR_ENOTIMPL when called from a filter < AP_FTYPE_CONNECTION.

Looks good to me. I understood that you verified that it fixes the issue and I 
see no obvious downsides right now.
Let's go with it and improve (feature wise) from there.

Regards

Rüdiger



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-30 Thread Ruediger Pluem



On 3/31/20 1:19 AM, Yann Ylavic wrote:
> Index: server/core_filters.c
> ===
> --- server/core_filters.c (revision 1875881)
> +++ server/core_filters.c (working copy)
> @@ -543,6 +543,12 @@ static apr_status_t send_brigade_nonblocking(apr_s
>  
>  rv = apr_bucket_read(bucket, &data, &length, APR_BLOCK_READ);
>  }
> +if (APR_STATUS_IS_EOF(rv)) {
> +/* Morphing bucket exhausted, next. */
> +apr_bucket_delete(bucket);
> +rv = APR_SUCCESS;
> +continue;
> +}
>  if (rv != APR_SUCCESS) {
>  goto cleanup;
>  }


How is the above related to the issue here? I guess this is something probably 
all callers to apr_bucket_read need to take care
of, correct?

Regards

Rüdiger


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-30 Thread Yann Ylavic
On Mon, Mar 30, 2020 at 10:23 PM Ruediger Pluem  wrote:
>
> On 3/26/20 2:00 PM, Yann Ylavic wrote:
>>
>> Index: server/util_filter.c
>> ===
>> --- server/util_filter.c(revision 1875498)
>> +++ server/util_filter.c(working copy)
>> @@ -945,34 +956,56 @@ AP_DECLARE(apr_status_t) ap_filter_setaside_brigad
>>(!fp->bb || APR_BRIGADE_EMPTY(fp->bb)) ? "empty" : "full",
>>f->frec->name);
>>
>> +/* Buckets in fp->bb are leftover from previous call to reinstate, so
>> + * they happen after the buckets (re)set aside here.
>> + */
>> +if (fp->bb) {
>> +APR_BRIGADE_CONCAT(bb, fp->bb);
>> +}
>> +
>
> How could it happen that fp->bb is not empty here?

Yes, correct, this is a hunk from another (unsuccessful) patch which
had ap_filter_reinstate_brigade() do the apr_bucket_read()s on
morphing buckets up to the limits, and retain the rest. That rest
then had to be restored on the next call to ap_filter_setaside_brigade().
Never mind, it's irrelevant to this patch.

>>
>> +
>> +if (tmp_bb) {
>> +/* Save any remainder. */
>> +if (!APR_BRIGADE_EMPTY(tmp_bb)) {
>> +rv = save_aside_brigade(fp, tmp_bb);
>> +APR_BRIGADE_PREPEND(bb, tmp_bb);
>
> In which case is this needed? tmp_bb should be always empty unless 
> ap_save_brigade fails.

Correct, in which case it sets the unsaved buckets back in bb. I wondered
if we should clean up bb (unsaved buckets) before leaving, but if some
buckets can't be saved here they are likely new ones (those which have
already been saved once shouldn't fail), so this PREPEND allows us to
give them back to the caller. It's probably safer to clean them up,
though.

>
> +}
> +ap_release_brigade(f->c, tmp_bb);
>  }
> -
>  }

>
> I think independent of the patch the API is not safe for use by request 
> filters

Yes I agree, ap_filter_reinstate/setaside_brigade() shouldn't be used
in filters < AP_FTYPE_CONNECTION, that's why this patch removes
ap_request_core_filter(). But these helpers are useful for
ap_core_output_filter() and/or ssl_io_filter_output(), i.e. the ones
that face the "flood" from the handler and need to block at the
limits.

> because once we return from the handler we insert
> the EOR bucket at the level of connection filters (c->output_filters) or 
> before the patch at the level of the request_core_filter:

It was originally sent to c->output_filters (like in 2.4.x), but a
previous commit of mine tried to send it to ap_request_core_filter(),
for bad reasons. This patch is restoring previous behaviour in this
regard.

[]
>> +ap_pass_brigade(c->output_filters, bb);
>
> But if request filters still have setaside brigades and these would be 
> reactivated by ap_run_output_pending(c); in the
> WRITE_COMPLETION phase they would write their response content after the EOR 
> bucket to the filter chain, which actually means that
> the EOR bucket is in the middle of the response data.

Yes that's the issue from the start, request filters are not supposed
to retain anything after the EOS bucket, they are simply not called
anymore once they return.

> We would need to ensure that EOR is sent after EOS. But I guess we cannot 
> send EOR through the remaining request filters as most
> of them ignore and swallow what they see after EOS.

No, too dangerous IMHO. If we pass the EOR to request filters they
need to be really cautious about what to do with it, and even more after
they pass it (don't control it anymore). Anything they have allocated
on r->pool (temporary brigade or whatever) can't be used anymore once
the EOR is destroyed.

The EOR has to be sent to the connection filters, when we know that
the request is finished, and before we start sending anything for the
next request. That's ap_process_request_after_handler() currently and
it looks like the right place to me.


FWIW, attached is v2 with simpler "batching" in
ap_filter_setaside_brigade(), no spurious hunk, bb cleaned up at the
end, and APR_ENOTIMPL when called from a filter < AP_FTYPE_CONNECTION.

Regards,
Yann.
Index: server/core.c
===
--- server/core.c	(revision 1875881)
+++ server/core.c	(working copy)
@@ -122,7 +122,6 @@ AP_IMPLEMENT_HOOK_RUN_FIRST(apr_status_t, insert_n
 
 /* Handles for core filters */
 AP_DECLARE_DATA ap_filter_rec_t *ap_subreq_core_filter_handle;
-AP_DECLARE_DATA ap_filter_rec_t *ap_request_core_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_core_output_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_content_length_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_core_input_filter_handle;
@@ -5916,9 +5927,6 @@ static void register_hooks(apr_pool_t *p)
 ap_core_output_filter_handle =
 ap_register_output_filter("CORE", ap_core_output_filter,
   NULL, 

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-30 Thread Ruediger Pluem



On 3/26/20 2:00 PM, Yann Ylavic wrote:

> 
> Wouldn't that work (patch attached, passes test framework and Joe's repro)?
> 
> 


Index: server/util_filter.c
===
--- server/util_filter.c(revision 1875498)
+++ server/util_filter.c(working copy)
@@ -945,34 +956,56 @@ AP_DECLARE(apr_status_t) ap_filter_setaside_brigad
   (!fp->bb || APR_BRIGADE_EMPTY(fp->bb)) ? "empty" : "full",
   f->frec->name);

+/* Buckets in fp->bb are leftover from previous call to reinstate, so
+ * they happen after the buckets (re)set aside here.
+ */
+if (fp->bb) {
+APR_BRIGADE_CONCAT(bb, fp->bb);
+}
+

How could it happen that fp->bb is not empty here?

 if (!APR_BRIGADE_EMPTY(bb)) {
+apr_bucket_brigade *tmp_bb = NULL;
+
 /*
- * Set aside the brigade bb within fp->bb.
+ * Set aside the brigade bb to fp->bb.
  */
+
 ap_filter_prepare_brigade(f);
+do {
+apr_bucket *e = APR_BRIGADE_FIRST(bb);
+APR_BUCKET_REMOVE(e);

-/* decide what pool we setaside to, request pool or deferred pool? */
-if (f->r) {
-apr_bucket *e;
-for (e = APR_BRIGADE_FIRST(bb); e != APR_BRIGADE_SENTINEL(bb); e =
-APR_BUCKET_NEXT(e)) {
-if (APR_BUCKET_IS_TRANSIENT(e)) {
-int rv = apr_bucket_setaside(e, f->r->pool);
+/* Assume that morphing buckets have the correct lifetime! */
+if (AP_BUCKET_IS_MORPHING(e)) {
+if (tmp_bb && !APR_BRIGADE_EMPTY(tmp_bb)) {
+/* Save non-morphing buckets batched below. */
+rv = save_aside_brigade(fp, tmp_bb);
 if (rv != APR_SUCCESS) {
-return rv;
+APR_BRIGADE_PREPEND(bb, tmp_bb);
+break;
 }
 }
+APR_BRIGADE_INSERT_TAIL(fp->bb, e);
 }
-APR_BRIGADE_CONCAT(fp->bb, bb);
-}
-else {
-if (!fp->deferred_pool) {
+apr_pool_create(&fp->deferred_pool, f->c->pool);
-apr_pool_tag(fp->deferred_pool, "deferred_pool");
+else {
+/* Save calls to ap_save_brigade() by batching successive
+ * non-morping buckets into a temporary brigade.
+ */
+if (!tmp_bb) {
+tmp_bb = ap_acquire_brigade(f->c);
+}
+APR_BRIGADE_INSERT_TAIL(tmp_bb, e);
 }
-return ap_save_brigade(f, &fp->bb, &bb, fp->deferred_pool);
+} while (!APR_BRIGADE_EMPTY(bb));
+
+if (tmp_bb) {
+/* Save any remainder. */
+if (!APR_BRIGADE_EMPTY(tmp_bb)) {
+rv = save_aside_brigade(fp, tmp_bb);
+APR_BRIGADE_PREPEND(bb, tmp_bb);

In which case is this needed? tmp_bb should be always empty unless 
ap_save_brigade fails.

+}
+ap_release_brigade(f->c, tmp_bb);
 }
-
 }
 else if (fp->deferred_pool) {
 /*


I think independent of the patch the API is not safe for use by request filters 
because once we return from the handler we insert
the EOR bucket at the level of connection filters (c->output_filters) or before 
the patch at the level of the request_core_filter:

Index: modules/http/http_request.c
===
--- modules/http/http_request.c (revision 1875498)
+++ modules/http/http_request.c (working copy)
@@ -350,7 +350,6 @@ AP_DECLARE(void) ap_process_request_after_handler(
 apr_bucket_brigade *bb;
 apr_bucket *b;
 conn_rec *c = r->connection;
-ap_filter_t *f;

 bb = ap_acquire_brigade(c);

@@ -371,15 +370,9 @@ AP_DECLARE(void) ap_process_request_after_handler(

 /* All the request filters should have bailed out on EOS, and in any
  * case they shouldn't have to handle this EOR which will destroy the
- * request underneath them. So go straight to the core request filter
- * which (if any) will take care of the setaside buckets.
+ * request underneath them. So go straight to the connection filters.
  */
-for (f = r->output_filters; f; f = f->next) {
-if (f->frec == ap_request_core_filter_handle) {
-break;
-}
-}
-ap_pass_brigade(f ? f : c->output_filters, bb);
+ap_pass_brigade(c->output_filters, bb);

 /* The EOR bucket has either been handled by an output filter (eg.
  * deleted or moved to a buffered_bb => no more in bb), or an error


But if request filters still have setaside brigades and these would be 
reactivated by ap_run_output_pending(c); in the
WRITE_COMPLETION phase they would write their response content after the EOR 
bucket to the filter chain, which actually means that
the EOR 

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-28 Thread Yann Ylavic
On Sat, Mar 28, 2020 at 11:24 AM Graham Leggett  wrote:
>
> On 27 Mar 2020, at 14:01, Yann Ylavic  wrote:
>
> >> We need to find the reason that in a non-async case, data is being 
> >> setaside, and we need to fix it.
> >
> > Connection and network output filters shouldn't care about async or
> > not, they just feed the pipe as much as possible, and setaside what
> > they can't (up to reinstate limits). For non-async this is pipelining,
> > let's not disable that.
> >
> > For non-async (though async would also benefit from it), what we need
> > to do is make sure that each request (worth handling in httpd) gets
> > finally EOR-ed, so that it's accounted (pipelining for non-async, or
> > blocking for async).
>
> Async shouldn’t block, that’s the whole point of being async.
>
> It only blocks in pathological cases, where the choice is between block, or 
> die because out-of-ram or out-of-file-handles.

That's precisely what we are accounting for in
ap_filter_reinstate_brigade() right?
And when the core/network output filter reaches its limits what else
can it do than block (if you don't want setaside RAM or FDs to go
unbounded)?
The current model is that if a handler (or WRITE_COMPLETION) doesn't
want the network filter to block, it shouldn't give the output filters
more data than they can setaside, and first test should_yield, and
never FLUSH.
This is easy for WRITE_COMPLETION because it adds nothing (empty
brigade), so the output filters won't start to block all of a sudden
if they previously released the thread without (or after) blocking.
But what are handlers supposed to do when should_yield? Existing
filters know nothing about this, they need to be rewritten to go async
and be called back. I see no automagic way we can do this for them.

In the meantime, output filters will block unless their setaside data
fit the limits.
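
[As a rough illustration of that contract, a hypothetical streaming handler, assuming trunk's ap_filter_should_yield(); have_more_data() and next_chunk() are made-up helpers, not httpd functions.]

#include "httpd.h"
#include "http_protocol.h"
#include "util_filter.h"

/* Made-up helpers for this sketch only. */
static int have_more_data(request_rec *r);
static apr_bucket *next_chunk(request_rec *r);

static int streaming_handler(request_rec *r)
{
    conn_rec *c = r->connection;
    apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);

    while (have_more_data(r)) {
        if (ap_filter_should_yield(r->output_filters)) {
            /* The filter chain already holds as much as it is willing to
             * set aside; pushing more would force the network filter to
             * block. An async-aware handler would SUSPEND here (or
             * register an MPM callback) and resume later. */
            break;
        }
        APR_BRIGADE_INSERT_TAIL(bb, next_chunk(r));
        ap_pass_brigade(r->output_filters, bb); /* no FLUSH buckets */
        apr_brigade_cleanup(bb);
    }
    return OK;
}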

>
> I plan to add EAGAIN support to the core (that was the point behind a recent 
> POC patch), at this point all the blocking vanishes. Adding blocking back in 
> breaks this.

By core do you mean the default_handler()? mod_proxy handlers?
All ap_pass_brigade() callers could get EAGAIN?

>
> In my original design (network and connnection filters only),

(and TRANSCODE with ap_request_core_filter)

> In your design (add request filters)

Huh? No, I didn't add ap_request_core_filter(), it was there from the
start (r1706669).
I just tried to fix it several times, and failed (having a better
picture each time ;)
Anyway, the proposal to simply remove it should work for you then,
care to comment on the proposed patch (in this thread) which does
that?

Possibly you could tell us what ap_request_core_filter()'s original point was, too?
I think it was (and still is) to not let morphing buckets through, and
if so I (now) think it's unnecessary.
Provided we trust the caller to pass either "request pooled" morphing
bucket (which we trust to precede EOR), or "connection pooled"
morphing buckets (it's not necessarily leaky to do so), setaside can
be a move from user brigade to filter brigade and vice versa.
And then the request core filter is useless, connection/network ones
can do their job.

Let's see how it works for both event and worker MPMs, when e.g. a 2MB
file is requested (suppose CONN_STATE_ is always READ_REQUEST_LINE for
mpm_worker, the rest in this common part is really identical for the
two MPMs, so it's easier to write it like this):

0.1  process_socket()
1.1  CONN_STATE_READ_REQUEST_LINE?
2.1  process_connection()
3.1  process_request()
4.1  _handler()
5.0  ap_pass_brigade(next, 2MB.file.bucket, EOS)
6.0  core_output_filter(2MB.file.bucket, EOS)
6.1  send_brigade_nonblocking() = EAGAIN/SUCCESS
6.2  remaining > flush_max_threshold (say 512KB)
       block (POLLOUT)
       goto 6.1
6.3  setaside EAGAIN data (say 256KB)
4.2  ap_pass_brigade(c->output_filters, EOR)

Now with mpm_event:
3.2  CONN_STATE_WRITE_COMPLETION!
1.2  CONN_STATE_WRITE_COMPLETION?
1.3.1  some ap_run_output_pending()?
         WRITE_COMPLETION queue (POLLOUT)
1.3.2  else
         CONN_STATE_READ_REQUEST_LINE! (almost)
         KEEPALIVE queue (POLLIN)
1.4  CONN_STATE_LINGER?
       LINGER queue
1.5
0.2  time passes
0.3  socket event
0.4  goto 0

Whereas with mpm_worker:
4.3  no ap_run_input_pending()?
       ap_pass_brigade(c->output_filters, FLUSH)
3.2  AP_CONN_KEEPALIVE?
       goto 2.1
0.2  ap_lingering_close(), including ap_flush_conn()
0.3


So both MPMs might block in 6.2 (depending on whether the 1.5MB can be
sent nonblocking in a single pass or not), but at 6.3 they are sure
they can setaside what remains, and do so.

After that, with mpm_event the pending 256KB will be sent nonblocking
in WRITE_COMPLETION scheduling (by any thread available), and any next
request on the connection will be processed later (likely yet another
thread).

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-28 Thread Graham Leggett
On 27 Mar 2020, at 14:01, Yann Ylavic  wrote:

>> We need to find the reason that in a non-async case, data is being setaside, 
>> and we need to fix it.
> 
> Connection and network output filters shouldn't care about async or
> not, they just feed the pipe as much as possible, and setaside what
> they can't (up to reinstate limits). For non-async this is pipelining,
> let's not disable that.
> 
> For non-async (though async would also benefit from it), what we need
> to do is make sure that each request (worth handling in httpd) gets
> finally EOR-ed, so that it's accounted (pipelining for non-async, or
> blocking for async).

Async shouldn’t block, that’s the whole point of being async.

It only blocks in pathological cases, where the choice is between block, or die 
because out-of-ram or out-of-file-handles.

Pipelining should definitely be supported in the async case.

I plan to add EAGAIN support to the core (that was the point behind a recent 
POC patch), at this point all the blocking vanishes. Adding blocking back in 
breaks this.

> The bug in the current code is about ap_request_core_filter()
> retaining data (because core should_yield) and returning, but not
> getting called again (because it's not a connection filter and the
> core and MPMs care only about that).
> Whether ap_request_core_filter() should block depending on
> async/non-async does not matter IMHO.

In my original design (network and connection filters only), async MPMs got 
async support, and non-async MPMs behaved identically to httpd 2.4. The idea 
was that the door was always open to backport this, but having slept on it it’s 
a big change, so I never proposed it for backport.

In your design (add request filters), you added the additional behaviour to let 
all connection filters be async. Trouble is, this doesn’t work on non async 
MPMs, and here we are.

I argue, given we’re not backporting this, let’s just teach the other MPMs how 
to be async.

That means, at the end of handling every connection, run the appropriate hooks 
to completely flush out the setaside brigades, just like the event MPM and 
others do.

In other words, just make everything async in trunk and be done with it.

We know async works very well, because event is the default MPM, and that’s the 
most common path. Don’t leave the door open for weird bugs in the other MPMs.

Regards,
Graham
—





Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-27 Thread Yann Ylavic
On Fri, Mar 27, 2020 at 11:54 AM Graham Leggett  wrote:
>
> We need to find the reason that in a non-async case, data is being setaside, 
> and we need to fix it.

Connection and network output filters shouldn't care about async or
not, they just feed the pipe as much as possible, and setaside what
they can't (up to reinstate limits). For non-async this is pipelining,
let's not disable that.

For non-async (though async would also benefit from it), what we need
to do is make sure that each request (worth handling in httpd) gets
finally EOR-ed, so that it's accounted (pipelining for non-async, or
blocking for async).

The bug in the current code is about ap_request_core_filter()
retaining data (because core should_yield) and returning, but not
getting called again (because it's not a connection filter and the
core and MPMs care only about that).
Whether ap_request_core_filter() should block depending on
async/non-async does not matter IMHO.

Regards,
Yann.


Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-27 Thread Graham Leggett
On 26 Mar 2020, at 13:41, Joe Orton  wrote:

> On Thu, Mar 26, 2020 at 01:11:10AM +0200, Graham Leggett wrote:
> The question you’re asking is “why is an async path being taken 
>> when AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should 
>> be noops in this situation.
> 
> The "noop" path in ap_filter_setaside_brigade is when it is passed an 
> empty brigade, right?

It’s not right, no.

To understand how this works, you need to understand how the core network 
filter works in Apache 2.4.

For the event MPM to be able to be an event MPM, there needs to be a mechanism 
by which buckets are not swallowed whole. Concrete example: 1TB file bucket. We 
cannot block forever trying to write 1TB of data to one client, starving all 
other clients of the opportunity to be sent data, and call ourselves event 
driven.

So, what the core network filter does is this:

- Reinstate earlier setaside data (if there was any).
- Try to make non-blocking writes until the OS tells us it’s going to block.
- Setaside any unconsumed buckets, back off and let someone else have a go.

The core network filter also handles flow control. Pipelining is a thing, so we 
have to limit the number of requests outstanding in the pipeline. File descriptors 
are a thing, so we need to track how many file buckets are in the brigade. 
Beyond a threshold, the core filter blocks to keep things sane.

The type of buckets in this case is irrelevant and always has been. They could 
be heap buckets with a fixed size, virtual buckets of some kind like file 
buckets that have a fixed size but access their data by reference, morphing 
buckets with no fixed size at all, EOR buckets that clean up a request when 
consumed, or an EOC bucket that cleans up a connection when consumed.

Now, lets look at trunk.

This was generalised so that any filter can do what the core network filter can 
do:

- Reinstate earlier setaside data (if there was any).
- Ask if the OS is going to block - this is the should_yield. If the answer is 
“no”, the filter writes data. If the answer is always “no”, old filter 
behaviour applies, the entire brigade will get written as it did, and the core 
will block as it did.
- If should_yield says “yes” there will be data left over; set the data aside, and 
let someone else have a go. If should_yield is always “no”, setaside will 
always be empty in every filter.

What happens when we enter write completion? Async MPMs will keep writing empty 
brigades down the filter stack, until every last setaside bucket is flushed 
out, then we’re done and we can handle the next request when it arrives.

Now let’s look at non-async MPMs:

All non-async MPMs must return “no” to should_yield. This means that all data 
in the filters gets written, and no data gets set aside.

Because no data is set aside, there is no need to send any empty brigades down 
the stack to coax any unwritten setaside data to go down the chain, so existing 
non-async MPMs work unchanged.

This is what "The setasides and reinstates should be noops in this situation” 
means.

Now let’s look at non async filters:

Non async filters don't stop until their entire brigade is written, all 1TB of 
it. If all the filter did was count bucket sizes, the filter passes the entire 
lot down and steps aside, no harm done. But if the filter transforms the data, 
e.g. by compressing it, all 1TB gets eaten in one go. Oh well. Still works though.

So, the bug.

It looks like async filters work fine.

We are seeing the non-async case not working properly.

We need to find the reason that in a non-async case, data is being setaside, 
and we need to fix it.

Regards,
Graham
—





Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-26 Thread Yann Ylavic
On Thu, Mar 26, 2020 at 12:18 PM Ruediger Pluem  wrote:
>
> On 3/26/20 12:11 AM, Graham Leggett wrote:
> >
> > What it does do is work around buggy request filters (set it to connection) 
> > or buggy connection filters (set it to network and get 2.4 behaviour)..

What it does is read/morph buckets before the connection filters; is that
because we can't pass some "buggy request_filter" morphing bucket to
the core/ssl output filters? Or more exactly to a further
ap_filter_setaside_brigade()?

If some filter (besides the core ones) needs to apr_bucket_read() morphing
buckets, it possibly should use ap_filter_reinstate_brigade_ex() with a
new AP_FILTER_REINSTATE_MORPH option (probably accompanied by
AP_FILTER_REINSTATE_NONBLOCKING for the usual/optional first
nonblocking, then flush, then blocking dance).

> >
> > Set AsyncFilter appropriately and you’ll at least narrow it down.
> >
> > The question you’re asking is “why is an async path being taken when 
> > AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should be noops in 
> > this situation.
>
> Having the setasides and reinstates as noops in this situation might help, 
> but as far as I can tell they make no difference
> whether the MPM is sync or async.
> What makes you think that an async path is taken?

Not until WRITE_COMPLETION.

> Setting AsyncFilter to connection will very likely help (not tested yet) as 
> it effectively removes the ap_request_core_filter.

I think I'm convinced we don't need the core request filter.
It does nothing different from the next ssl_io_filter_output() or
ap_core_output_filter() it seems, so is it just a matter of morphing
buckets' lifetime?

Thanks to the EOR bucket mechanism, we assume that objects owned by a
request can be used safely provided they precede the EOR, an assumption
which IMHO we can extend to morphing buckets. If so, I think we can
avoid the apr_bucket_setaside() = ENOTIMPL issue by simply moving
morphing buckets from/to user and filter brigades in
ap_filter_setaside/reinstate_brigade().
Just like FILE buckets, morphing buckets are just file descriptors
that use no memory until read, so if we move them on
setaside/reinstate we can just ignore them for flushing heuristic and
flush_max_threshold accounting.

So ap_filter_reinstate_brigade() would set *flush_upto based on in-memory
limits only (and FLUSH of course); anything else can wait.
The blocking limit is flush_max_pipelined then, but we could split
flush_max_threshold into more fine grained _max_in_memory_threshold
and _max_total_threshold if needed.

Wouldn't that work (patch attached, passes test framework and Joe's repro)?


Regards,
Yann.
Index: server/core.c
===
--- server/core.c	(revision 1875498)
+++ server/core.c	(working copy)
@@ -122,7 +122,6 @@ AP_IMPLEMENT_HOOK_RUN_FIRST(apr_status_t, insert_n
 
 /* Handles for core filters */
 AP_DECLARE_DATA ap_filter_rec_t *ap_subreq_core_filter_handle;
-AP_DECLARE_DATA ap_filter_rec_t *ap_request_core_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_core_output_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_content_length_filter_handle;
 AP_DECLARE_DATA ap_filter_rec_t *ap_core_input_filter_handle;
@@ -5916,9 +5915,6 @@ static void register_hooks(apr_pool_t *p)
 ap_core_output_filter_handle =
 ap_register_output_filter("CORE", ap_core_output_filter,
   NULL, AP_FTYPE_NETWORK);
-ap_request_core_filter_handle =
-ap_register_output_filter("REQ_CORE", ap_request_core_filter,
-  NULL, AP_FTYPE_CONNECTION - 1);
 ap_subreq_core_filter_handle =
 ap_register_output_filter("SUBREQ_CORE", ap_sub_req_output_filter,
   NULL, AP_FTYPE_CONTENT_SET);
Index: modules/http/http_core.c
===
--- modules/http/http_core.c	(revision 1875498)
+++ modules/http/http_core.c	(working copy)
@@ -268,8 +268,6 @@ static int http_create_request(request_rec *r)
 NULL, r, r->connection);
 ap_add_output_filter_handle(ap_http_outerror_filter_handle,
 NULL, r, r->connection);
-ap_add_output_filter_handle(ap_request_core_filter_handle,
-NULL, r, r->connection);
 }
 
 return OK;
Index: server/request.c
===
--- server/request.c	(revision 1875498)
+++ server/request.c	(working copy)
@@ -2058,124 +2058,6 @@ AP_CORE_DECLARE_NONSTD(apr_status_t) ap_sub_req_ou
 return APR_SUCCESS;
 }
 
-AP_CORE_DECLARE_NONSTD(apr_status_t) ap_request_core_filter(ap_filter_t *f,
-apr_bucket_brigade *bb)
-{
-apr_status_t status = APR_SUCCESS;
-apr_read_type_e block = APR_NONBLOCK_READ;
-conn_rec *c = f->r->connection;
-apr_bucket *flush_upto = NULL;
-

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-26 Thread Joe Orton
On Thu, Mar 26, 2020 at 01:11:10AM +0200, Graham Leggett wrote:
> The question you’re asking is “why is an async path being taken 
> when AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should 
> be noops in this situation.

The "noop" path in ap_filter_setaside_brigade is when it is passed an 
empty brigade, right?  It is trivial to see this is not true for worker 
based off the logging for the tests which are triggering this:

sh-5.0$ > t/logs/error_log 
sh-5.0$ ./t/TEST t/apache/passbrigade.t &>/dev/null 
sh-5.0$ grep resuming t/logs/error_log 
[Thu Mar 26 11:26:28.632069 2020] [mpm_worker:notice] [pid 60770:tid 
139707977312256] AH00292: Apache/2.5.1-dev (Unix) OpenSSL/1.1.1d configured -- 
resuming normal operations
sh-5.0$ grep -c 'setaside full' t/logs/error_log 
392






Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-26 Thread Ruediger Pluem



On 3/26/20 12:11 AM, Graham Leggett wrote:
> On 24 Mar 2020, at 11:47, Joe Orton  wrote:
> 
>> On Tue, Mar 24, 2020 at 12:35:45AM +0200, Graham Leggett wrote:
>>> The most likely reason is because:

> 
>>  If this is the correct default I guess I'd ask 
>> why isn't this hard-coded - at least maybe for affected MPMs?  Maybe try 
>> it and see if the Travis failures go away.
> 
> It is not the correct default, no.
> 
> What it does do is work around buggy request filters (set it to connection) 
> or buggy connection filters (set it to network and get 2.4 behaviour)..
> 
> Set AsyncFilter appropriately and you’ll at least narrow it down.
> 
> The question you’re asking is “why is an async path being taken when 
> AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should be noops in 
> this situation.

Having the setasides and reinstates as noops in this situation might help, but 
as far as I can tell they make no difference
whether the MPM is sync or async.
What makes you think that an async path is taken?
Setting AsyncFilter to connection will very likely help (not tested yet) as it 
effectively removes the ap_request_core_filter.

Regards

Rüdiger



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-25 Thread Graham Leggett
On 24 Mar 2020, at 11:47, Joe Orton  wrote:

> On Tue, Mar 24, 2020 at 12:35:45AM +0200, Graham Leggett wrote:
>> The most likely reason is because:
>> 
>> a) http://httpd.apache.org/docs/trunk/mod/core.html#asyncfilter defaults to 
>> request, meaning request filters are expected to support async filters; and
>> 
>> b) the worker MPM doesn't call the ap_run_output_pending() hook, and 
>> so by definition does not support async filters.
>> 
>> The fix (as far as I can see) is for the worker MPM to run the 
>> ap_run_output_pending() hook.
> 
> Only 3/8 trunk MPMs have that call, so those are all broken right now?  
> I know we have seen this fail for both prefork and worker in Travis.

They shouldn't be broken, because only 3/8 trunk MPMs declare that they support 
AP_MPMQ_IS_ASYNC.

>> What happens if you set "AsyncFilter connection”? Does it skip the problem?
> 
> Are you unable to repro using the script I posted in the thread Ruediger 
> referenced to confirm?

Not without a lot of retooling in my builds in the middle of an office move 
with no notice ahead of a pending countrywide shutdown, so no, alas.

>  If this is the correct default I guess I'd ask 
> why isn't this hard-coded - at least maybe for affected MPMs?  Maybe try 
> it and see if the Travis failures go away.

It is not the correct default, no.

What it does do is work around buggy request filters (set it to connection) or 
buggy connection filters (set it to network and get 2.4 behaviour)..

Set AsyncFilter appropriately and you’ll at least narrow it down.

The question you’re asking is “why is an async path being taken when 
AP_MPMQ_IS_ASYNC is false”. The setasides and reinstates should be noops in 
this situation.

Regards,
Graham
—





Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-24 Thread Joe Orton
On Tue, Mar 24, 2020 at 12:35:45AM +0200, Graham Leggett wrote:
> The most likely reason is because:
> 
> a) http://httpd.apache.org/docs/trunk/mod/core.html#asyncfilter defaults to 
> request, meaning request filters are expected to support async filters; and
> 
> b) the worker MPM doesn't call the ap_run_output_pending() hook, and 
> so by definition does not support async filters.
> 
> The fix (as far as I can see) is for the worker MPM to run the 
> ap_run_output_pending() hook.

Only 3/8 trunk MPMs have that call, so those are all broken right now?  
I know we have seen this fail for both prefork and worker in Travis.

> What happens if you set "AsyncFilter connection”? Does it skip the problem?

Are you unable to repro using the script I posted in the thread Ruediger 
referenced to confirm?  If this is the correct default I guess I'd ask 
why isn't this hard-coded - at least maybe for affected MPMs?  Maybe try 
it and see if the Travis failures go away.

Regards, Joe



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-23 Thread Graham Leggett
On 23 Mar 2020, at 14:53, Ruediger Pluem  wrote:

> To sum it up:
> 
> ap_request_core_filter might setaside an EOR and data before this and return,

That’s normal.

The core supports pipelining, meaning multiple requests, and therefore multiple 
EOR buckets can be lined up inside the core network filter (2.4) or any async 
filter (trunk).

There is a safety check that doesn’t allow more than X EOR buckets (and 
therefore request pools) and no more than Y file buckets (and therefore file 
descriptors) in the pipeline at once; this triggers a block until capacity has 
been freed, but other than that, we’re fine so far.
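
[For illustration, the kind of accounting being described; the function and thresholds are made up, while in trunk this lives in the core output filter / ap_filter_reinstate_brigade() heuristics.]

#include "httpd.h"
#include "apr_buckets.h"

/* Count the resources pinned by a pending brigade: each EOR keeps a whole
 * request pool alive, each FILE bucket keeps a file descriptor open. */
static int pipeline_over_limits(apr_bucket_brigade *bb,
                                int max_eor, int max_files)
{
    int eor = 0, files = 0;
    apr_bucket *b;

    for (b = APR_BRIGADE_FIRST(bb);
         b != APR_BRIGADE_SENTINEL(bb);
         b = APR_BUCKET_NEXT(b)) {
        if (AP_BUCKET_IS_EOR(b)) {
            eor++;
        }
        else if (APR_BUCKET_IS_FILE(b)) {
            files++;
        }
    }
    /* Past either limit the caller should block until capacity is freed. */
    return (eor > max_eor) || (files > max_files);
}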

> but it is never called again,

If we were in the event MPM we would be called again by the 
ap_run_output_pending() hook, which allows the request filters to be repeatedly 
called until all the filter stacks at all levels are empty:

https://github.com/apache/httpd/blob/40d37b8a304f93cd14def4e9eab887b00a3d0e78/server/mpm/event/event.c#L1150

I see no call to ap_run_output_pending() in the worker MPM:

https://github.com/apache/httpd/blob/trunk/server/mpm/worker/worker.c

> because after EOR was
> sent only connection and later filters (c->output_filters) are called 
> afterwards, but nothing before.

The most likely reason is because:

a) http://httpd.apache.org/docs/trunk/mod/core.html#asyncfilter defaults to 
request, meaning request filters are expected to support async filters; and

b) the worker MPM doesn't call the ap_run_output_pending() hook, and so by 
definition does not support async filters.

The fix (as far as I can see) is for the worker MPM to run the 
ap_run_output_pending() hook.

What happens if you set "AsyncFilter connection”? Does it skip the problem?

Regards,
Graham
—





Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-23 Thread Ruediger Pluem



On 3/23/20 11:32 AM, Graham Leggett wrote:
> On 23 Mar 2020, at 09:40, Ruediger Pluem  wrote:
> 
>> In short: The async filter code currently corrupts responses in certain 
>> situations. For the whole story please look here:
>>
>> https://lists.apache.org/thread.html/r94fc303b3d2d8d0a057c150cbbcf8841b313e4de9a97b79203ac62a5%40%3Cdev.httpd.apache.org%3E
>>
>> I know that my proposal is not the final solution, but at least it avoids 
>> response corruption until a better solution is
>> available. Maybe Yann already has a better one in his latest post to the 
>> original thread.
> 
> I’ve read the thread a number of times, and I’ve not been able to fully 
> understand it.
> 
> Can someone describe the problem specifically?


I guess the best description is from here (Joe at 1/7/20, 5:06 PM):

> [Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] 
> [client 127.0.0.1:41112] [mod_test_pass_brigade]
wrote 8192 of 8192 bytes
> [Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] 
> [client 127.0.0.1:41112] [mod_test_pass_brigade] done
writing 1024 of 1024 bytes
> [Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 
> 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112]
reinstate full brigade to full brigade in 'req_core' output filter
> [Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 
> 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112]
brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets: 0
> [Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 
> 140235421763328] util_filter.c(942): [client 127.0.0.1:41112]
setaside full brigade to empty brigade in 'req_core' output filter
> [Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 
> 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112]
reinstate full brigade to full brigade in 'core' output filter
>
> From the above we can see there is 48K of data buffered in one of the
> filters.  At this point we've passed through:
>
> /* Prepend buckets set aside, if any. */
> ap_filter_reinstate_brigade(f, bb, NULL);
> if (APR_BRIGADE_EMPTY(bb)) {
> return APR_SUCCESS;
> }
>
> I then dumped the contents of bb to the error log directly after:
>
> [Tue Jan 07 13:09:32.090837 2020] [core:trace1] [pid 117876:tid 
> 140235421763328] core_filters.c(373): [client 127.0.0.1:41112]
sbb: cof - bucket 0 HEAP length 1654
> [Tue Jan 07 13:09:32.090839 2020] [core:trace1] [pid 117876:tid 
> 140235421763328] core_filters.c(373): [client 127.0.0.1:41112]
sbb: cof - bucket 1 IMMORTAL length 2
> [Tue Jan 07 13:09:32.090842 2020] [core:trace1] [pid 117876:tid 
> 140235421763328] core_filters.c(373): [client 127.0.0.1:41112]
sbb: cof - bucket 2 FLUSH length 0
> [Tue Jan 07 13:09:32.090844 2020] [core:trace1] [pid 117876:tid 
> 140235421763328] core_filters.c(373): [client 127.0.0.1:41112]
sbb: cof - bucket 3 EOC length 0
>
> so the 48K does not make it back to the core output filter and is never
> sent.
>

and the analysis from Yann (1/7/20, 7:31 PM) here:

> > [Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] 
> > [client 127.0.0.1:41112] [mod_test_pass_brigade]
wrote 8192 of 8192 bytes
> > [Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] 
> > [client 127.0.0.1:41112] [mod_test_pass_brigade]
done writing 1024 of 1024 bytes
> > [Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 
> > 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112]
reinstate full brigade to full brigade in 'req_core' output filter
> > [Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 
> > 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112]
brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets: 0
> > [Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 
> > 140235421763328] util_filter.c(942): [client 127.0.0.1:41112]
setaside full brigade to empty brigade in 'req_core' output filter
>
> Here, the call stack is:
>   ap_process_request()
>   => ap_process_async_request()
>      => ap_process_request_after_handler()
>         => ap_pass_brigade()
>            => ap_request_core_filter()
> so ap_request_core_filter() is called with the EOR bucket, but since
> there are still pending/setaside/unsent data in the core filter then
> ap_request_core_filter() returns without passing all of its own
> pending data (including the EOR).
>
> > [Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 
> > 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112]
reinstate full brigade to full brigade in 'core' output filter
>
> Here, we are now at:
>   ap_process_request()
>   => ap_pass_brigade(c->output_filters, ...)
> but ap_request_core_filter() is not a connection filter (i.e. not part
> of c->output_filters), so it will never be called again.
> 

Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-23 Thread Graham Leggett
On 23 Mar 2020, at 09:40, Ruediger Pluem  wrote:

> In short: The async filter code currently corrupts responses in certain 
> situations. For the whole story please look here:
> 
> https://lists.apache.org/thread.html/r94fc303b3d2d8d0a057c150cbbcf8841b313e4de9a97b79203ac62a5%40%3Cdev.httpd.apache.org%3E
>  
> 
> 
> I know that my proposal is not the final solution, but at least it avoids 
> response corruption until a better solution is
> available. Maybe Yann already has a better one in his latest post to the 
> original thread.

I’ve read the thread a number of times, and I’ve not been able to fully 
understand it.

Can someone describe the problem specifically?

I see lots of surprise that morphing buckets (length = -1) end up in the core - 
this is all normal. You’ll see this in the existing core filter in v2.4.
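
[For reference, "morphing" here means a bucket whose length is unknown until it is read; a minimal check, assuming only the public APR bucket API.]

#include "apr_buckets.h"

/* A data bucket whose content is not yet determined reports an unknown
 * length; reading it morphs it into concrete (heap/file/mmap) buckets. */
static int is_morphing(const apr_bucket *b)
{
    return !APR_BUCKET_IS_METADATA(b) && b->length == (apr_size_t)-1;
}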

Regards,
Graham
—





Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-23 Thread Ruediger Pluem



On 3/23/20 1:49 AM, Graham Leggett wrote:
> On 06 Mar 2020, at 10:30, Ruediger Pluem  wrote:
> 
>> Anyone found time to check
>>
>> https://github.com/apache/httpd/pull/88
>>
>> regarding the async filter bug?
> 
> Do these PRs come through on a mailing list anywhere? If there is no 
> notification, they’ll never get seen.
> 
> I immediately trip over "Ensure that we always pass on all we have” - that’s 
> the exact opposite of what we’re trying to do.
> 
> All the async filter stuff is, is generalising the algorithm that has long 
> existed in the core network filter, allowing any filter to take advantage of 
> the same pattern.
> 
> As soon as we try “pass all we have”, that would break the event MPM, which 
> relies on small writes in turn to work.
> 
> Can you explain the problem you’re trying to solve?
> 

In short: The async filter code currently corrupts responses in certain 
situations. For the whole story please look here:

https://lists.apache.org/thread.html/r94fc303b3d2d8d0a057c150cbbcf8841b313e4de9a97b79203ac62a5%40%3Cdev.httpd.apache.org%3E

I know that my proposal is not the final solution, but at least it avoids 
response corruption until a better solution is
available. Maybe Yann already has a better one in his latest post to the 
original thread.

Regards

Rüdiger




Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-22 Thread Graham Leggett
On 06 Mar 2020, at 10:30, Ruediger Pluem  wrote:

> Anyone found time to check
> 
> https://github.com/apache/httpd/pull/88
> 
> regarding the async filter bug?

Do these PRs come through on a mailing list anywhere? If there is no 
notification, they’ll never get seen.

I immediately trip over "Ensure that we always pass on all we have” - that’s 
the exact opposite of what we’re trying to do.

All the async filter stuff is, is generalising the algorithm that has long 
existed in the core network filter, allowing any filter to take advantage of 
the same pattern.

As soon as we try “pass all we have”, that would break the event MPM, which 
relies on small writes in turn to work.

Can you explain the problem you’re trying to solve?

Regards,
Graham
—





Re: async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-22 Thread Graham Leggett
On 20 Mar 2020, at 15:46, Joe Orton  wrote:

>> Anyone found time to check
>> 
>> https://github.com/apache/httpd/pull/88
>> 
>> regarding the async filter bug?
> 
> [crickets]
> 
> We are two months on since discussing this first.  It seems quite 
> worrying that the async filters stuff was introduced with apparently 
> nasty bugs and yet is already bitrotting on trunk with not enough people 
> to review and fix it.  Can we move the code to a branch until it is 
> ready?

First time I've seen this thread - can you be more specific, what nasty bugs 
are you referring to?

I see http://svn.apache.org/viewvc?view=revision&revision=1874775 refers to 
"The async filter bug”, but there is no PR, or any indication of what the bug 
is?

Regards,
Graham
—





Re: async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-22 Thread Yann Ylavic
On Fri, Mar 20, 2020 at 2:46 PM Joe Orton  wrote:
>
> [crickets]
>
> We are two months on since discussing this first.  It seems quite
> worrying that the async filters stuff was introduced with apparently
> nasty bugs and yet is already bitrotting on trunk with not enough people
> to review and fix it.

Sorry for not being able to work on this so far, just proposed a new
patch in the original thread.

> Can we move the code to a branch until it is
> ready?

This probably wouldn't be easy, there are some further changes (on
trunk) that depend on the async filters, like
ap_proxy_transfer_between_connections() shared by several proxy
modules now, plus mpm_event obviously.
But if we can't make async filters work with reasonable changes, sure
let's move it out of trunk.

Regards,
Yann.


Re: async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-20 Thread Ruediger Pluem



On 3/20/20 2:46 PM, Joe Orton wrote:
> On Fri, Mar 06, 2020 at 09:30:41AM +0100, Ruediger Pluem wrote:
>> On 3/4/20 9:23 AM, jor...@apache.org wrote:
>>> Author: jorton
>>> Date: Wed Mar  4 08:23:55 2020
>>> New Revision: 1874775
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1874775&view=rev
>>> Log:
>>> Update docs. The expr_string.t failure has not been seen since 
>>> the workaround was added AFAICT.  The async filter bug
>>> is still breaking the tests regularly. [skip ci].
>>
>> Anyone found time to check
>>
>> https://github.com/apache/httpd/pull/88
>>
>> regarding the async filter bug?
> 
> [crickets]
> 
> We are two months on since discussing this first.  It seems quite 
> worrying that the async filters stuff was introduced with apparently 
> nasty bugs and yet is already bitrotting on trunk with not enough people 
> to review and fix it.  Can we move the code to a branch until it is 
> ready?

Maybe better, yes. So +1.

Regards

Rüdiger


Re: async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-20 Thread Jim Jagielski
+1 from me.

> On Mar 20, 2020, at 9:46 AM, Joe Orton  wrote:
> 
> On Fri, Mar 06, 2020 at 09:30:41AM +0100, Ruediger Pluem wrote:
>> On 3/4/20 9:23 AM, jor...@apache.org wrote:
>>> Author: jorton
>>> Date: Wed Mar  4 08:23:55 2020
>>> New Revision: 1874775
>>> 
>>> URL: http://svn.apache.org/viewvc?rev=1874775&view=rev
>>> Log:
>>> Update docs. The expr_string.t failure has not been seen since 
>>> the workaround was added AFAICT.  The async filter bug
>>> is still breaking the tests regularly. [skip ci].
>> 
>> Anyone found time to check
>> 
>> https://github.com/apache/httpd/pull/88
>> 
>> regarding the async filter bug?
> 
> [crickets]
> 
> We are two months on since discussing this first.  It seems quite 
> worrying that the async filters stuff was introduced with apparently 
> nasty bugs and yet is already bitrotting on trunk with not enough people 
> to review and fix it.  Can we move the code to a branch until it is 
> ready?
> 
> Regards, Joe
> 



async filters still borked (was Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis)

2020-03-20 Thread Joe Orton
On Fri, Mar 06, 2020 at 09:30:41AM +0100, Ruediger Pluem wrote:
> On 3/4/20 9:23 AM, jor...@apache.org wrote:
> > Author: jorton
> > Date: Wed Mar  4 08:23:55 2020
> > New Revision: 1874775
> > 
> > URL: http://svn.apache.org/viewvc?rev=1874775&view=rev
> > Log:
> > Update docs. The expr_string.t failure has not been seen since 
> > the workaround was added AFAICT.  The async filter bug
> > is still breaking the tests regularly. [skip ci].
> 
> Anyone found time to check
> 
> https://github.com/apache/httpd/pull/88
> 
> regarding the async filter bug?

[crickets]

We are two months on since discussing this first.  It seems quite 
worrying that the async filters stuff was introduced with apparently 
nasty bugs and yet is already bitrotting on trunk with not enough people 
to review and fix it.  Can we move the code to a branch until it is 
ready?

Regards, Joe



Re: svn commit: r1874775 - /httpd/httpd/trunk/test/README.travis

2020-03-06 Thread Ruediger Pluem



On 3/4/20 9:23 AM, jor...@apache.org wrote:
> Author: jorton
> Date: Wed Mar  4 08:23:55 2020
> New Revision: 1874775
> 
> URL: http://svn.apache.org/viewvc?rev=1874775&view=rev
> Log:
> Update docs. The expr_string.t failure has not been seen since 
> the workaround was added AFAICT.  The async filter bug
> is still breaking the tests regularly. [skip ci].

Anyone found time to check

https://github.com/apache/httpd/pull/88

regarding the async filter bug?

Regards

Rüdiger