On Nov 29, 2012, at 7:42 AM, Erik A Johnson wrote:
>>>> No, the test to bug out doesn't work because net_geterror(proxy->fd_ssl)
>>>> returns 0 in the statement
>>>>
>>>> if (!proxy->client_proxy && net_geterror(proxy->fd_ssl) == ENOTCONN) {

I was wrong here: the FIRST time, net_geterror returns EBADF = "The argument socket is not a valid file descriptor" (and then 0 on subsequent calls).

On November 29, 2012 at 12:43:42 PM PST, Timo Sirainen wrote:
>>> I wonder if something like this would work:
>>>
>>> if (!proxy->client && read(proxy->fd_ssl, &err, 0) < 0 && errno ==
>>> ENOTCONN) {

On November 29, 2012 at 2:12:18 PM PST, Ben Morrow wrote:
>> How about calling getpeername on fd_ssl? That should reliably tell you
>> if the socket is connected or not. http://cr.yp.to/docs/connect.html
>> suggests that read is not always a reliable test for that.

Thanks, Ben.

On November 29, 2012 at 2:39:51 PM PST, Timo Sirainen wrote:
> Yes, that sounds like it would work better:
>
> if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 &&
> errno == ENOTCONN) {

Using getpeername or net_getpeername, errno is set to EINVAL = "socket has been shut down", so we could instead use

if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 && errno == EINVAL) {

So it seems that we have the following options:

1. net_geterror(proxy->fd_ssl) == EBADF
2. read(proxy->fd_ssl, &err, 0) < 0 && errno == ENOTCONN
3. net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 && errno == EINVAL

Which is preferable? Should the "#ifdef __APPLE__" remain, or would any of these tests be appropriate for other platforms as well?

On Nov 28, 2012, at 10:18PM PST, Timo Sirainen wrote:
>>>>> This is either OSX bug or OpenSSL bug.. Apparently what happens is:
>>>>>
>>>>> 1. Client sends SYN packet to Dovecot
>>>>> 2. Dovecot accept()s the connection (sends SYN-ACK) and goes into OpenSSL
>>>>> code
>>>>> 3. Client doesn't send ACK to Dovecot. Does it send RST or nothing or
>>>>> something else? I don't know.
>>>>> 4. OSX notices anyway that something is wrong with the socket, and kqueue
>>>>> says that the socket is ready for reading
>>>>> 5. OpenSSL read()s, which fails with ENOTCONN. But OpenSSL thinks this is
>>>>> a non-fatal error and simply asks to be notified again when something can
>>>>> be read
>>>>> 6. goto 4
>>>>>
>>>>> So, whose bug is it? OpenSSL's ENOTCONN handling probably makes sense for
>>>>> client connections where connect() hasn't finished yet. But then again,
>>>>> this is accept()ed connection where it typically should fail like that.
>>>>> Except I guess it might be correct behavior if read() is done after
>>>>> SYN-ACK but before receiving ACK.
>>>>>
>>>>> While OSX is receiving ACK from the client, it shouldn't say that the fd
>>>>> is readable. It probably doesn't. But after it receives <something> it
>>>>> realizes that the socket is disconnected. So read() probably shouldn't be
>>>>> returning ENOTCONN anymore at this point, but instead ECONNRESET or
>>>>> ETIMEDOUT.
>>>>>
>>>>> See if the attached patch helps.
>>>>>
>>>>>
>>>>> On 29.11.2012, at 7.45, Erik A Johnson wrote:
>>>>>> Here's the log:
>>>>>>
>>>>>> Nov 28 21:28:11 macbookpro-e17d.home dovecot[54139]: master: Dovecot
>>>>>> v2.1.10 starting up (core dumps disabled)
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:
>>>>>> ssl_step()
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:
>>>>>> ssl_handshake: SSL_accept()=-1
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:
>>>>>> SSL_get_error() = 2
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:
>>>>>> - want_read
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:
>>>>>> ssl_set_io(0)
>>>>>> [last 5 lines are repeated until process is killed]
>>>>>>
>>>>>> On Nov 26, 2012, at 11:38PM PST, Timo Sirainen wrote:
>>>>>>>
>>>>>>> Could you try with the attached patch, and with only the problematic
>>>>>>> client running? What does it log (the beginning of the session until it
>>>>>>> starts repeating the same lines)?
>>>>>>>
>>>>>>> On 10.11.2012, at 12.44, Erik A Johnson wrote:
>>>>>>>> imap-login processes are hanging (using 100% of CPU) when connected
>>>>>>>> from a client that is partially blocked by a firewall. It appears
>>>>>>>> that imap-login is stuck in a loop trying to complete an ssl
>>>>>>>> handshake. imap-login is working fine for other clients not blocked
>>>>>>>> by the firewall (including localhost).
>>>>>>>>
>>>>>>>> This is dovecot 2.1.10 under Mac OS X 10.8.2 (compiled from sources);
>>>>>>>> the firewall is Little Snitch 3.0.1 blocking port 993, which appears
>>>>>>>> to let the connection initiate but then squashes and disconnects the
>>>>>>>> socket during ssl handshaking.
>>>>>>>>
>>>>>>>> gdb backtrace and Activity Monitor's "Sample Process" show that
>>>>>>>> imap-login is stuck calling ioloop-kqueue's io_loop_handler_run ->
>>>>>>>> io_loop_call_io -> ssl_step repeatedly; dtruss shows that it is
>>>>>>>> repeatedly making system calls to kevent and read, the latter
>>>>>>>> returning -1 with errno 57=ENOTCONN="Socket is not connected". (I
>>>>>>>> also tried ./configure --with-ioloop=poll and --with-ioloop=select
>>>>>>>> instead of the default best = kqueue but the results were the same;
>>>>>>>> --with-ioloop=epoll didn't work because epoll is not available on this
>>>>>>>> machine.) The client, initiated by the command "openssl s_client
>>>>>>>> -connect SERVER:993", first responds "CONNECTED(00000003)" but then
>>>>>>>> immediately the error "60278:error:140790E5:SSL
>>>>>>>> routines:SSL23_WRITE:ssl handshake
>>>>>>>> failure:/SourceCache/OpenSSL098/OpenSSL098-44/src/ssl/s23_lib.c:182:".
>>>>>>>> The infinite loop is in src/lib/ioloop.c in the function
>>>>>>>> "io_loop_run" where the statement "while (ioloop->running)
>>>>>>>> io_loop_handler_run(ioloop)" is executed.
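
For anyone who wants to try the getpeername idea (option 3 in the list of candidate tests) outside of Dovecot, here is a minimal standalone sketch. It is not the patch discussed in this thread and not Dovecot code: the helper name fd_peer_gone is made up for illustration, and it calls plain getpeername() with a real address buffer rather than net_getpeername(). It treats both ENOTCONN (the documented "not connected" error) and EINVAL (what was reported above on OS X 10.8.2) as "peer is gone"; whether EINVAL should be accepted on other platforms is exactly the open question.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper, not part of Dovecot: ask the kernel for the peer
 * address of a socket and classify the result.
 *
 * Returns  0 if a peer address is available (the socket looks connected),
 *          1 if getpeername() fails with ENOTCONN or EINVAL,
 *         -1 on any other failure (errno is left as set by getpeername()).
 */
static int fd_peer_gone(int fd)
{
	struct sockaddr_storage addr;
	socklen_t len = sizeof(addr);

	if (getpeername(fd, (struct sockaddr *)&addr, &len) == 0)
		return 0;

	/* ENOTCONN is the documented error for an unconnected socket.
	   EINVAL is an assumption taken from this thread: it is what was
	   observed on OS X 10.8.2 after the socket had been shut down. */
	if (errno == ENOTCONN || errno == EINVAL)
		return 1;

	return -1;
}
```

In the handshake path, a check along the lines of "if (!proxy->client && fd_peer_gone(proxy->fd_ssl) == 1)" would then stand in for the errno comparison in option 3. Returning -1 for unexpected errors (for example EBADF on an invalid descriptor) keeps those cases separate, so the caller can decide whether to treat them as a disconnect as well.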
