Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
Hi Stefan, Joe and all, Am 16.10.2018 um 11:15 schrieb Joe Orton: On Mon, Oct 15, 2018 at 12:55:45PM +0200, Rainer Jung wrote: I'm currently testing the following patch which looks OK wrt. test suite results. Need to run more combinations (OpenSSL version client versus server) though. Server with 1.1.1 and with 1.0.2p both look OK (including the h2 tests). Maybe some cases could be folded together or be dropped, but I tried to make the logic changes not to big. The SSL_ERROR_ZERO_RETURN part is new, because without that we get an ssl:info log line AH01992 with error 6 (SSL_ERROR_ZERO_RETURN) at the end of the response (at least with 1.1.1). Thanks a lot Rainer & Stefan, sorry I didn't follow through on that ticket/issue far enough. Strike it down as another way that 1.1.1 really is ABI-incompatible with <1.1.1 :( The change committed to ssl_engine_io.c makes sense to me. I wonder if mod_ssl should also handle SSL_ERROR_WANT_WRITE here as well. It will be clearly logged if that happens ("SSL library error 3 reading data") so we should find out anyway. Thanks to you both for double checking. I tried to keep behavior change restricted to the observed problem but yes, it might be, that a more complete approach would cover more cases, that we are just not aware right now. It is just that I don't feel myself in a position for that more complete approach. I will propose the current trunk change for 2.4 but any suggestions for improvements would be highly welcome. My testing currently at least shows no problems with the httpd test suite. Regards, Rainer
Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
On Mon, Oct 15, 2018 at 12:55:45PM +0200, Rainer Jung wrote: > I'm currently testing the following patch which looks OK wrt. test suite > results. Need to run more combinations (OpenSSL version client versus > server) though. Server with 1.1.1 and with 1.0.2p both look OK (including > the h2 tests). Maybe some cases could be folded together or be dropped, but > I tried to make the logic changes not to big. The SSL_ERROR_ZERO_RETURN part > is new, because without that we get an ssl:info log line AH01992 with error > 6 (SSL_ERROR_ZERO_RETURN) at the end of the response (at least with 1.1.1). Thanks a lot Rainer & Stefan, sorry I didn't follow through on that ticket/issue far enough. Strike it down as another way that 1.1.1 really is ABI-incompatible with <1.1.1 :( The change committed to ssl_engine_io.c makes sense to me. I wonder if mod_ssl should also handle SSL_ERROR_WANT_WRITE here as well. It will be clearly logged if that happens ("SSL library error 3 reading data") so we should find out anyway. Regards, Joe
Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
> Am 15.10.2018 um 23:16 schrieb Rainer Jung : > > Adjusted SSL_read() rc value 0 handling applied in r1843954 to trunk. I'll > let it sit there until tomorrow for comments and then suggest for backport. Thanks, Rainer! The h2 test suite runs as smoothly as it did before on my machine with OpenSSL 1.1.1 in trunk. I am all for backporting and testing von 2.4.x on this. -Stefan > > Am 15.10.2018 um 12:55 schrieb Rainer Jung: >> Am 15.10.2018 um 10:02 schrieb Stefan Eissing: >>> >>> Am 14.10.2018 um 00:46 schrieb Rainer Jung : It seems the h2 failure only happens when building httpd against OpenSSL 1.1.1 (independent of TLS version used). I did a quick check with an httpd build against 1.1.0i and there the same vhost of the test framework instance worked with the same clients, that failed for 1.1.1. The client side OpenSSL version seems not to matter. When using 1.1.1 in the server even browsers seem to fail with h2. Anyone who can successfully use h2 with 2.4.36 build against OpenSSL 1.1.1? >>> >>> Me, and I got reports from others. >>> When comparing trace logs between the 1.1.1 server and the 1.1.0i server, one can see, that a failing OpenSSL read (0 bytes) results in APR_EOF (70014) for 1.1.1 but in errno 11 (EAGAIN) for 1.1.0i. I wonder whether this is due to the new SSL_MODE_AUTO_RETRY and maybe the following change: +#if OPENSSL_VERSION_NUMBER >= 0x1010100fL +/* For OpenSSL >=1.1.1, disable auto-retry mode so it's possible + * to consume handshake records without blocking for app-data. + * https://github.com/openssl/openssl/issues/7178 */ +SSL_CTX_clear_mode(ctx, SSL_MODE_AUTO_RETRY); +#endif >>> >>> Hmm, just read the ticket made by Joe regarding this change. >>> >>> I try to summarize my understanding: >>> >>> - this has nothing to do with HTTP/2 in particular. It's only triggered by >>> the h2 calling sequence. >>> - SSL_read() only returns transport data, but sometimes reads (part of) TLS >>> meta data, such as handshake messages. >>> - When it reads this meta data (and has processed it), it can either do >>> another read directly after or return with >>>its equivalent of EAGAIN. By default, it does the first. >>> - The change in handshake handling between TLSv1.2 and TLSv1.3 led, in >>> Joe's testing, to the case where SSL_read()'s >>>internal re-read() on the socket blocked, as the client was not sending >>> anything. >>> - So, after discussion on the openssl issue tracker, Joe changed the >>> OpenSSL behaviour of this by inserting the code above. >>> - Now, when SSL_read() is called, processes some meta data, it will not >>> read() on the socket again, but return a read of length 0. >>> - mod_ssl interprets this as EOF: >>> ssl_engine_io.c, lines 673.. >>> >>> else if (rc == 0) { >>> /* If EAGAIN, we will loop given a blocking read, >>> * otherwise consider ourselves at EOF. >>> */ >>> >>> and returns APR_EOF to h2 which then shuts the connection down. >>> >>> To my understanding, mod_ssl's ssl_engine_io.c, ssl_io_input_read() has the >>> special case handling that: >>> - if it gets a SSL_ERROR_WANT_READ from SSL_read(), it remembers that, >>> calls again >>> - if SSL_read() == 0, non-blocking, it >>>a) on having seen SSL_ERROR_WANT_READ before, return APR_EGAIN >>>b) otherwise, return APR_EOF >>> >>> OpenSSL's documentation of SSL_read() now states: >>> Old documentation indicated a difference between 0 and -1, and that -1 was retry-able. You should instead call SSL_get_error() to find out if it's retry-able. >>> >>> So, we should change our logic here. Anyone having the courage to do so? ;-) >> Thanks for the further investigations. >> I'm currently testing the following patch which looks OK wrt. test suite >> results. Need to run more combinations (OpenSSL version client versus >> server) though. Server with 1.1.1 and with 1.0.2p both look OK (including >> the h2 tests). Maybe some cases could be folded together or be dropped, but >> I tried to make the logic changes not to big. The SSL_ERROR_ZERO_RETURN part >> is new, because without that we get an ssl:info log line AH01992 with error >> 6 (SSL_ERROR_ZERO_RETURN) at the end of the response (at least with 1.1.1). >> --- modules/ssl/ssl_engine_io.c.orig2018-08-15 17:01:08.0 +0200 >> +++ modules/ssl/ssl_engine_io.c 2018-10-15 11:46:01.258042000 +0200 >> @@ -680,35 +680,32 @@ >> } >> return inctx->rc; >> } >> -else if (rc == 0) { >> -/* If EAGAIN, we will loop given a blocking read, >> - * otherwise consider ourselves at EOF. >> - */ >> -if (APR_STATUS_IS_EAGAIN(inctx->rc) >> -|| APR_STATUS_IS_EINTR(inctx->rc)) { >> -/* Already read something, return
Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
Adjusted SSL_read() rc value 0 handling applied in r1843954 to trunk. I'll let it sit there until tomorrow for comments and then suggest for backport. Am 15.10.2018 um 12:55 schrieb Rainer Jung: Am 15.10.2018 um 10:02 schrieb Stefan Eissing: Am 14.10.2018 um 00:46 schrieb Rainer Jung : It seems the h2 failure only happens when building httpd against OpenSSL 1.1.1 (independent of TLS version used). I did a quick check with an httpd build against 1.1.0i and there the same vhost of the test framework instance worked with the same clients, that failed for 1.1.1. The client side OpenSSL version seems not to matter. When using 1.1.1 in the server even browsers seem to fail with h2. Anyone who can successfully use h2 with 2.4.36 build against OpenSSL 1.1.1? Me, and I got reports from others. When comparing trace logs between the 1.1.1 server and the 1.1.0i server, one can see, that a failing OpenSSL read (0 bytes) results in APR_EOF (70014) for 1.1.1 but in errno 11 (EAGAIN) for 1.1.0i. I wonder whether this is due to the new SSL_MODE_AUTO_RETRY and maybe the following change: +#if OPENSSL_VERSION_NUMBER >= 0x1010100fL + /* For OpenSSL >=1.1.1, disable auto-retry mode so it's possible + * to consume handshake records without blocking for app-data. + * https://github.com/openssl/openssl/issues/7178 */ + SSL_CTX_clear_mode(ctx, SSL_MODE_AUTO_RETRY); +#endif Hmm, just read the ticket made by Joe regarding this change. I try to summarize my understanding: - this has nothing to do with HTTP/2 in particular. It's only triggered by the h2 calling sequence. - SSL_read() only returns transport data, but sometimes reads (part of) TLS meta data, such as handshake messages. - When it reads this meta data (and has processed it), it can either do another read directly after or return with its equivalent of EAGAIN. By default, it does the first. - The change in handshake handling between TLSv1.2 and TLSv1.3 led, in Joe's testing, to the case where SSL_read()'s internal re-read() on the socket blocked, as the client was not sending anything. - So, after discussion on the openssl issue tracker, Joe changed the OpenSSL behaviour of this by inserting the code above. - Now, when SSL_read() is called, processes some meta data, it will not read() on the socket again, but return a read of length 0. - mod_ssl interprets this as EOF: ssl_engine_io.c, lines 673.. else if (rc == 0) { /* If EAGAIN, we will loop given a blocking read, * otherwise consider ourselves at EOF. */ and returns APR_EOF to h2 which then shuts the connection down. To my understanding, mod_ssl's ssl_engine_io.c, ssl_io_input_read() has the special case handling that: - if it gets a SSL_ERROR_WANT_READ from SSL_read(), it remembers that, calls again - if SSL_read() == 0, non-blocking, it a) on having seen SSL_ERROR_WANT_READ before, return APR_EGAIN b) otherwise, return APR_EOF OpenSSL's documentation of SSL_read() now states: Old documentation indicated a difference between 0 and -1, and that -1 was retry-able. You should instead call SSL_get_error() to find out if it's retry-able. So, we should change our logic here. Anyone having the courage to do so? ;-) Thanks for the further investigations. I'm currently testing the following patch which looks OK wrt. test suite results. Need to run more combinations (OpenSSL version client versus server) though. Server with 1.1.1 and with 1.0.2p both look OK (including the h2 tests). Maybe some cases could be folded together or be dropped, but I tried to make the logic changes not to big. The SSL_ERROR_ZERO_RETURN part is new, because without that we get an ssl:info log line AH01992 with error 6 (SSL_ERROR_ZERO_RETURN) at the end of the response (at least with 1.1.1). --- modules/ssl/ssl_engine_io.c.orig 2018-08-15 17:01:08.0 +0200 +++ modules/ssl/ssl_engine_io.c 2018-10-15 11:46:01.258042000 +0200 @@ -680,35 +680,32 @@ } return inctx->rc; } - else if (rc == 0) { - /* If EAGAIN, we will loop given a blocking read, - * otherwise consider ourselves at EOF. - */ - if (APR_STATUS_IS_EAGAIN(inctx->rc) - || APR_STATUS_IS_EINTR(inctx->rc)) { - /* Already read something, return APR_SUCCESS instead. - * On win32 in particular, but perhaps on other kernels, - * a blocking call isn't 'always' blocking. + else /* (rc <= 0) */ { + if (rc == 0) { + /* If EAGAIN, we will loop given a blocking read, + * otherwise consider ourselves at EOF. */ - if (*len > 0) { - inctx->rc = APR_SUCCESS; - break; - } - if (inctx->block == APR_NONBLOCK_READ) { - break; - } -
Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
Am 15.10.2018 um 10:02 schrieb Stefan Eissing: Am 14.10.2018 um 00:46 schrieb Rainer Jung : It seems the h2 failure only happens when building httpd against OpenSSL 1.1.1 (independent of TLS version used). I did a quick check with an httpd build against 1.1.0i and there the same vhost of the test framework instance worked with the same clients, that failed for 1.1.1. The client side OpenSSL version seems not to matter. When using 1.1.1 in the server even browsers seem to fail with h2. Anyone who can successfully use h2 with 2.4.36 build against OpenSSL 1.1.1? Me, and I got reports from others. When comparing trace logs between the 1.1.1 server and the 1.1.0i server, one can see, that a failing OpenSSL read (0 bytes) results in APR_EOF (70014) for 1.1.1 but in errno 11 (EAGAIN) for 1.1.0i. I wonder whether this is due to the new SSL_MODE_AUTO_RETRY and maybe the following change: +#if OPENSSL_VERSION_NUMBER >= 0x1010100fL +/* For OpenSSL >=1.1.1, disable auto-retry mode so it's possible + * to consume handshake records without blocking for app-data. + * https://github.com/openssl/openssl/issues/7178 */ +SSL_CTX_clear_mode(ctx, SSL_MODE_AUTO_RETRY); +#endif Hmm, just read the ticket made by Joe regarding this change. I try to summarize my understanding: - this has nothing to do with HTTP/2 in particular. It's only triggered by the h2 calling sequence. - SSL_read() only returns transport data, but sometimes reads (part of) TLS meta data, such as handshake messages. - When it reads this meta data (and has processed it), it can either do another read directly after or return with its equivalent of EAGAIN. By default, it does the first. - The change in handshake handling between TLSv1.2 and TLSv1.3 led, in Joe's testing, to the case where SSL_read()'s internal re-read() on the socket blocked, as the client was not sending anything. - So, after discussion on the openssl issue tracker, Joe changed the OpenSSL behaviour of this by inserting the code above. - Now, when SSL_read() is called, processes some meta data, it will not read() on the socket again, but return a read of length 0. - mod_ssl interprets this as EOF: ssl_engine_io.c, lines 673.. else if (rc == 0) { /* If EAGAIN, we will loop given a blocking read, * otherwise consider ourselves at EOF. */ and returns APR_EOF to h2 which then shuts the connection down. To my understanding, mod_ssl's ssl_engine_io.c, ssl_io_input_read() has the special case handling that: - if it gets a SSL_ERROR_WANT_READ from SSL_read(), it remembers that, calls again - if SSL_read() == 0, non-blocking, it a) on having seen SSL_ERROR_WANT_READ before, return APR_EGAIN b) otherwise, return APR_EOF OpenSSL's documentation of SSL_read() now states: Old documentation indicated a difference between 0 and -1, and that -1 was retry-able. You should instead call SSL_get_error() to find out if it's retry-able. So, we should change our logic here. Anyone having the courage to do so? ;-) Thanks for the further investigations. I'm currently testing the following patch which looks OK wrt. test suite results. Need to run more combinations (OpenSSL version client versus server) though. Server with 1.1.1 and with 1.0.2p both look OK (including the h2 tests). Maybe some cases could be folded together or be dropped, but I tried to make the logic changes not to big. The SSL_ERROR_ZERO_RETURN part is new, because without that we get an ssl:info log line AH01992 with error 6 (SSL_ERROR_ZERO_RETURN) at the end of the response (at least with 1.1.1). --- modules/ssl/ssl_engine_io.c.orig2018-08-15 17:01:08.0 +0200 +++ modules/ssl/ssl_engine_io.c 2018-10-15 11:46:01.258042000 +0200 @@ -680,35 +680,32 @@ } return inctx->rc; } -else if (rc == 0) { -/* If EAGAIN, we will loop given a blocking read, - * otherwise consider ourselves at EOF. - */ -if (APR_STATUS_IS_EAGAIN(inctx->rc) -|| APR_STATUS_IS_EINTR(inctx->rc)) { -/* Already read something, return APR_SUCCESS instead. - * On win32 in particular, but perhaps on other kernels, - * a blocking call isn't 'always' blocking. +else /* (rc <= 0) */ { +if (rc == 0) { +/* If EAGAIN, we will loop given a blocking read, + * otherwise consider ourselves at EOF. */ -if (*len > 0) { -inctx->rc = APR_SUCCESS; -break; -} -if (inctx->block == APR_NONBLOCK_READ) { -break; -} -} -else { -if (*len > 0) { -inctx->rc = APR_SUCCESS; +if (APR_STATUS_IS_EAGAIN(inctx->rc) +||
Re: h2 broken in 2.4.36 with OpenSSL 1.1.1? Related to SSL_MODE_AUTO_RETRY?
> Am 14.10.2018 um 00:46 schrieb Rainer Jung : > > It seems the h2 failure only happens when building httpd against OpenSSL > 1.1.1 (independent of TLS version used). I did a quick check with an httpd > build against 1.1.0i and there the same vhost of the test framework instance > worked with the same clients, that failed for 1.1.1. > > The client side OpenSSL version seems not to matter. > > When using 1.1.1 in the server even browsers seem to fail with h2. > > Anyone who can successfully use h2 with 2.4.36 build against OpenSSL 1.1.1? Me, and I got reports from others. > When comparing trace logs between the 1.1.1 server and the 1.1.0i server, one > can see, that a failing OpenSSL read (0 bytes) results in APR_EOF (70014) for > 1.1.1 but in errno 11 (EAGAIN) for 1.1.0i. I wonder whether this is due to > the new SSL_MODE_AUTO_RETRY and maybe the following change: > > +#if OPENSSL_VERSION_NUMBER >= 0x1010100fL > +/* For OpenSSL >=1.1.1, disable auto-retry mode so it's possible > + * to consume handshake records without blocking for app-data. > + * https://github.com/openssl/openssl/issues/7178 */ > +SSL_CTX_clear_mode(ctx, SSL_MODE_AUTO_RETRY); > +#endif Hmm, just read the ticket made by Joe regarding this change. I try to summarize my understanding: - this has nothing to do with HTTP/2 in particular. It's only triggered by the h2 calling sequence. - SSL_read() only returns transport data, but sometimes reads (part of) TLS meta data, such as handshake messages. - When it reads this meta data (and has processed it), it can either do another read directly after or return with its equivalent of EAGAIN. By default, it does the first. - The change in handshake handling between TLSv1.2 and TLSv1.3 led, in Joe's testing, to the case where SSL_read()'s internal re-read() on the socket blocked, as the client was not sending anything. - So, after discussion on the openssl issue tracker, Joe changed the OpenSSL behaviour of this by inserting the code above. - Now, when SSL_read() is called, processes some meta data, it will not read() on the socket again, but return a read of length 0. - mod_ssl interprets this as EOF: ssl_engine_io.c, lines 673.. else if (rc == 0) { /* If EAGAIN, we will loop given a blocking read, * otherwise consider ourselves at EOF. */ and returns APR_EOF to h2 which then shuts the connection down. To my understanding, mod_ssl's ssl_engine_io.c, ssl_io_input_read() has the special case handling that: - if it gets a SSL_ERROR_WANT_READ from SSL_read(), it remembers that, calls again - if SSL_read() == 0, non-blocking, it a) on having seen SSL_ERROR_WANT_READ before, return APR_EGAIN b) otherwise, return APR_EOF OpenSSL's documentation of SSL_read() now states: > Old documentation indicated a difference between 0 and -1, and that -1 was > retry-able. > You should instead call SSL_get_error() to find out if it's retry-able. So, we should change our logic here. Anyone having the courage to do so? ;-) -Stefan > > The logs: > > < OpenSSL 1.1.1 > > OpenSSL 1.1.0i > > < 23:35:16.021201 h2_session.c(1704): AH03078: h2_session(81,BUSY,0): > transit [INIT] -- init --> [BUSY] > > 23:39:55.715439 h2_session.c(1670): AH03078: h2_session(66,BUSY,0): > > transit [INIT] -- init --> [BUSY] > > Some additional early stuff for 1.1.0i > > > 23:39:55.715448 h2_filter.c(145): h2_session(66): read, NONBLOCK_READ, > > mode=0, readbytes=65536 > > 23:39:55.715470 ssl_engine_io.c(): OpenSSL: I/O error, 5 bytes expected > > to read on BIO#7efee0005280 [mem: 7efee0014ee3] > > 23:39:55.715480 h2_filter.c(190): (11)Resource temporarily unavailable: > > h2_session(66): read > > 23:39:55.715508 h2_mplx.c(1240): h2_mplx(66): dispatch events > > 23:39:55.715536 h2_session.c(589): AH03068: h2_session(66,BUSY,0): sent > > FRAME[SETTINGS[length=6, stream=0]], frames=0/1 (r/s) > > 23:39:55.715548 h2_session.c(589): AH03068: h2_session(66,BUSY,0): sent > > FRAME[WINDOW_UPDATE[stream=0, incr=2147418112]], frames=0/2 (r/s) > > 23:39:55.715565 h2_conn_io.c(122): h2_session(66)-out: heap[28] flush > > 23:39:55.715590 core_filters.c(580): brigade contains: bytes: 57, non-file > > bytes: 57, eor buckets: 0, morphing buckets: 0 > > 23:39:55.715598 ssl_engine_io.c(2213): OpenSSL: write 57/57 bytes to > > BIO#7efee00021b0 [mem: 7efee0014ee3] > > 23:39:55.715648 core_filters.c(525): will flush because of FLUSH bucket > > 23:39:55.715653 core_filters.c(535): seen in brigade so far: bytes: 57, > > non-file bytes: 57, eor buckets: 0, morphing buckets: 0 > > 23:39:55.715657 core_filters.c(554): flushing now > > 23:39:55.715683 core_filters.c(569): total bytes written: 2466 > > 23:39:55.715690 core_filters.c(580): brigade contains: bytes: 0, non-file > > bytes: 0, eor buckets: 0, morphing buckets: 0 > > Then many things identical > > < 23:35:16.021217 h2_filter.c(145): h2_session(81): read, NONBLOCK_READ, > mode=0,