On Tue, 09 Apr 2019 at 17:26:22 +0200, gregor herrmann wrote:
> On Tue, 09 Apr 2019 17:14:32 +0200, Guilhem Moulin wrote:
>> With TLS 1.3?  (You can pass ‘SSL_version => "TLSv1_3"’ to ssl_opts to
>> force this.)  Doesn't work here, still hangs on read():
>
> Yes, also with using TLSv1_3 explicitly:
> […]
> (trace attached in case it helps)

AFAICT this worked this time because the socket was *only* marked as
ready for writing after the first select() call. Only during the second
call was there some data to be read:
> select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 1 (out [3], left
> {tv_sec=179, tv_usec=999996})
> select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left
> {tv_sec=179, tv_usec=977469})
I'm unable to reproduce this with v1.3, probably due to race conditions.
Anyway I fail to see how the patch can help, because as I wrote in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914034#101 the socket
is in blocking mode (hence SSL_MODE_AUTO_RETRY is set) by the time LWP
starts its select loop. This is visible by adding fcntl(2) to the traced
set of system calls:
$ strace -etrace=fcntl,select,read perl -MLWP::UserAgent -MIO::Socket::SSL -e \
    '$IO::Socket::SSL::DEBUG = 3;
     LWP::UserAgent->new(ssl_opts => {SSL_version => "TLSv1_3"})->post("https://facebook.com", { data => "" })'
[…]
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce
timeout=180
DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
read(3, 0x5628bec16923, 5) = -1 EAGAIN (Resource temporarily
unavailable)
DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> -1
DEBUG: .../IO/Socket/SSL.pm:857: ssl handshake in progress
DEBUG: .../IO/Socket/SSL.pm:867: waiting for fd to become ready: SSL wants
a read first
select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left
{tv_sec=179, tv_usec=988296})
DEBUG: .../IO/Socket/SSL.pm:887: socket ready, retrying connect
DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
[…]
DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:902: ssl handshake done
fcntl(3, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(3, F_SETFL, O_RDWR) = 0
[…]
select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 2 (in [3], out [3],
left {tv_sec=179, tv_usec=999998})
read(3, "…", 5) = 5
read(3, "…", 156) = 156
read(3,
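For reference, the O_NONBLOCK toggling shown by the fcntl(2) lines in the trace can also be checked from Perl without strace. What follows is only a minimal sketch using core modules on a socketpair; the helper names is_nonblocking() and set_nonblocking() are made up for illustration and are not part of IO::Socket::SSL or LWP.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical helper: true if the handle currently has O_NONBLOCK set.
sub is_nonblocking {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFL, 0) or die "fcntl(F_GETFL): $!";
    return ($flags & O_NONBLOCK) ? 1 : 0;
}

# Hypothetical helper: set or clear O_NONBLOCK, mirroring the
# F_GETFL/F_SETFL pairs visible in the trace.
sub set_nonblocking {
    my ($fh, $on) = @_;
    my $flags = fcntl($fh, F_GETFL, 0) or die "fcntl(F_GETFL): $!";
    $flags = $on ? ($flags | O_NONBLOCK) : ($flags & ~O_NONBLOCK);
    fcntl($fh, F_SETFL, $flags) or die "fcntl(F_SETFL): $!";
    return;
}

# A local socketpair stands in for the TLS connection (fd 3 in the trace).
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

set_nonblocking($left, 1);    # before the handshake, to enforce the timeout
print "during handshake: ", (is_nonblocking($left) ? "non-blocking" : "blocking"), "\n";

set_nonblocking($left, 0);    # after "ssl handshake done"
print "in select loop:   ", (is_nonblocking($left) ? "non-blocking" : "blocking"), "\n";
```

With a real connection one would call is_nonblocking() on the socket object itself, since an IO::Socket handle can be passed to fcntl() as well.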
When a non-application data record comes in, the socket is marked as
ready for reading, but SSL_read() transparently processes that record
and then blocks trying to read an application data record.
If one is lucky and the socket is *only* marked as ready for writing (i.e.
not for reading as well, like in your trace) when select() returns, then
the problem doesn't trigger (at least not right after the handshake;
OTOH it might occur later on renegotiation). But AFAICT that's orthogonal
to whether the patch is applied or not: we use blocking I/O, so
SSL_MODE_AUTO_RETRY is set just like before (`Net::SSLeay::set_mode($ssl,
$mode_auto_retry)` is called just before clearing O_NONBLOCK).
If the (blocking) socket is marked for reading when select() returns,
then the application assumes that SSL_read() won't block, and setting
SSL_MODE_AUTO_RETRY breaks that assumption, as written in the OpenSSL
changelog. Instead of a blocking SSL_read(), the application expects the
call to fail with SSL_ERROR_WANT_READ, after which it proceeds with
SSL_write() if the socket is also ready for writing, like in the trace above.
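To make the mismatch between select() and SSL_read() concrete, here is a toy simulation in plain Perl. It does not use OpenSSL at all. The record framing (one type byte, one length byte, payload), the helper names and the 'BLOCKS' / 'WANT_READ' return strings are all assumptions made up for this sketch; 'BLOCKS' stands in for the point where a real blocking read() would hang.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Toy framing (not TLS): type byte ('H' = non-application record,
# 'A' = application data), length byte, payload.
sub send_record {
    my ($fh, $type, $payload) = @_;
    my $rec = pack('a1 C a*', $type, length($payload), $payload);
    defined syswrite($fh, $rec) or die "syswrite: $!";
    return;
}

# Read exactly $len bytes from the handle.
sub read_exactly {
    my ($fh, $len) = @_;
    my $buf = '';
    while (length($buf) < $len) {
        my $n = sysread($fh, $buf, $len - length($buf), length($buf));
        die "sysread: $!" unless defined $n;
        die "unexpected EOF" if $n == 0;
    }
    return $buf;
}

# select() with a zero timeout: is the fd readable right now?
sub readable_now {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $n = select(my $rout = $rin, undef, undef, 0);
    return $n > 0 ? 1 : 0;
}

# Stand-in for SSL_read() on a blocking socket.
sub toy_ssl_read {
    my ($fh, $auto_retry) = @_;
    while (1) {
        # A real blocking read() would sleep here if nothing is pending.
        return 'BLOCKS' unless readable_now($fh);
        my ($type, $len) = unpack('a1 C', read_exactly($fh, 2));
        my $payload = read_exactly($fh, $len);
        return "DATA:$payload" if $type eq 'A';
        # Non-application record consumed: without auto-retry, hand control
        # back to the caller; with it, loop and try to read another record.
        return 'WANT_READ' unless $auto_retry;
    }
}

socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

for my $auto_retry (0, 1) {
    send_record($server, 'H', 'ticket');    # a lone non-application record
    my $sel = readable_now($client) ? 'readable' : 'not readable';
    my $res = toy_ssl_read($client, $auto_retry);
    print "auto_retry=$auto_retry: select says $sel, read -> $res\n";
}
```

In both iterations select() reports the socket as readable; the read reports WANT_READ without auto-retry and BLOCKS with it, which is the behaviour described above for a caller that trusts select() on a blocking socket.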
--
Guilhem.