Bug#914034: Bug#911938: libhttp-daemon-ssl-perl FTBFS: tests fail: Connection refused

Guilhem Moulin Tue, 09 Apr 2019 14:42:18 -0700

On Tue, 09 Apr 2019 at 17:26:22 +0200, gregor herrmann wrote:
> On Tue, 09 Apr 2019 17:14:32 +0200, Guilhem Moulin wrote:
>> With TLS 1.3?  (You can pass ‘SSL_version => "TLSv1_3"’ to ssl_opts to
>> force this.)  Doesn't work here, still hangs on read():
> 
> Yes, also with using TLSv1_3 explicitly:
> […]
> (trace attached in case it helps)


AFAICT this worked this time because the socket was *only* marked as
ready for writing after the first select() call.  Only during the second
call was there some data to be read:

> select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 1 (out [3], left {tv_sec=179, tv_usec=999996})
> select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left {tv_sec=179, tv_usec=977469})

I'm unable to reproduce this with TLS 1.3, probably due to race conditions.
Anyway I fail to see how the patch can help, because as I wrote in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914034#101 the socket
is in blocking mode by the time LWP starts its select loop, hence
SSL_MODE_AUTO_RETRY is set.  The switch back to blocking mode is visible
when adding fcntl(2) to the traced set of system calls:

    $ strace -etrace=fcntl,select,read perl -MLWP::UserAgent -MIO::Socket::SSL -e \
        '$IO::Socket::SSL::DEBUG = 3;
         LWP::UserAgent->new(ssl_opts => {SSL_version => "TLSv1_3"})->post("https://facebook.com", { data => "" })'
    […]
    fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
    fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
    DEBUG: .../IO/Socket/SSL.pm:831: set socket to non-blocking to enforce timeout=180
    DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
    read(3, 0x5628bec16923, 5)              = -1 EAGAIN (Resource temporarily unavailable)
    DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> -1
    DEBUG: .../IO/Socket/SSL.pm:857: ssl handshake in progress
    DEBUG: .../IO/Socket/SSL.pm:867: waiting for fd to become ready: SSL wants a read first
    select(8, [3], NULL, NULL, {tv_sec=180, tv_usec=0}) = 1 (in [3], left {tv_sec=179, tv_usec=988296})
    DEBUG: .../IO/Socket/SSL.pm:887: socket ready, retrying connect
    DEBUG: .../IO/Socket/SSL.pm:844: call Net::SSLeay::connect
    […]
    DEBUG: .../IO/Socket/SSL.pm:847: done Net::SSLeay::connect -> 1
    DEBUG: .../IO/Socket/SSL.pm:902: ssl handshake done
    fcntl(3, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
    fcntl(3, F_SETFL, O_RDWR)               = 0
    […]
    select(8, [3], [3], NULL, {tv_sec=180, tv_usec=0}) = 2 (in [3], out [3], left {tv_sec=179, tv_usec=999998})
    read(3, "…", 5)   = 5
    read(3, "…", 156) = 156
    read(3,

When the non-application data record comes in, the socket is marked as
ready for reading, but SSL_read() transparently processes that record
and then blocks while trying to read an application data record.

If one is lucky and the socket is *only* marked as ready for writing (ie
not for reading as well, like in your trace) when select() returns then
the problem doesn't trigger (at least not right after the handshake —
OTOH it might occur later on renegotiation), but AFAICT it's orthogonal
to whether the patch is applied or not: we use blocking I/O, so
SSL_MODE_AUTO_RETRY is set just like before (`Net::SSLeay::set_mode($ssl,
$mode_auto_retry)` is called just before clearing O_NONBLOCK).

If the (blocking) socket is marked for reading when select() returns,
then the application assumes that SSL_read() won't block, and setting
SSL_MODE_AUTO_RETRY breaks that assumption, as written in the OpenSSL
changelog.  Instead of blocking, the application expects SSL_read() to
report SSL_ERROR_WANT_READ, after which it proceeds with SSL_write() if
the socket is also ready for writing, like in the trace above.
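
To make the mechanism concrete, here is a toy simulation in Python.  It
is only a sketch under stated assumptions: it uses neither OpenSSL nor
IO::Socket::SSL, and the two-byte record framing, the record type
constants, and the names toy_ssl_read(), demo() and WANT_READ are all
made up for illustration.  A short socket timeout stands in for the
indefinite hang so that the script terminates.

```python
import select
import socket

# Hypothetical record types for this toy framing (not real TLS values).
NON_APPLICATION, APPLICATION = 0, 1
WANT_READ = "WANT_READ"   # stand-in for SSL_ERROR_WANT_READ
BLOCKED = "BLOCKED"       # stand-in for "the read would hang"


def send_record(sock, rtype, payload=b""):
    """Send one toy record: 1 byte type, 1 byte length, then the payload."""
    sock.sendall(bytes([rtype, len(payload)]) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, waiting (up to the socket timeout) if needed."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def toy_ssl_read(sock, auto_retry):
    """Return application data, mimicking the two behaviours discussed above.

    With auto_retry, a non-application record is consumed silently and the
    function goes back to waiting for the next record.  Without it, the
    function returns WANT_READ and lets the caller go back to select().
    """
    while True:
        rtype, length = recv_exact(sock, 2)
        payload = recv_exact(sock, length)
        if rtype == APPLICATION:
            return payload
        if not auto_retry:
            return WANT_READ
        # auto_retry: loop again; the next recv() waits for more data.


def demo(auto_retry):
    """Peer sends only a non-application record; report what the read does."""
    client, server = socket.socketpair()
    try:
        client.settimeout(0.2)  # stands in for an indefinite hang
        send_record(server, NON_APPLICATION, b"post-handshake")
        readable, _, _ = select.select([client], [], [], 1.0)
        assert readable, "select() should report the socket as readable"
        try:
            return toy_ssl_read(client, auto_retry)
        except socket.timeout:
            return BLOCKED
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    for mode in (True, False):
        print("auto_retry =", mode, "->", demo(mode))
```

In both runs select() reports the socket as readable.  With auto_retry
the read then stalls until the timeout, which is the analogue of the
hang in the trace; without it the caller gets WANT_READ back and is
free to return to its select loop or go on to write.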

-- 
Guilhem.
