On Nov 29, 2012, at 7:42 AM, Erik A Johnson <[email protected]> wrote:
>>>> No, the test to bug out doesn't work because net_geterror(proxy->fd_ssl) 
>>>> returns 0 in the statement
>>>> 
>>>>   if (!proxy->client_proxy && net_geterror(proxy->fd_ssl) == ENOTCONN) {

I was wrong here: the FIRST time, net_geterror returns EBADF = "The argument 
socket is not a valid file descriptor" (and then 0 on subsequent calls).
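
For reference, a minimal sketch of reading a socket's pending error, assuming 
net_geterror is a thin wrapper around getsockopt() with SO_ERROR (I have not 
checked that against this exact Dovecot version; the helper name 
pending_socket_error is hypothetical).  Reading SO_ERROR clears the pending 
error, which may be why only the first call reports something:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not Dovecot's net_geterror): returns the socket's
   pending error (0 if none), or -1 with errno set if getsockopt() fails. */
static int pending_socket_error(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	/* SO_ERROR fetches and clears the pending error on the socket. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -1;
	return err;
}
```

With this shape, an EBADF would come from getsockopt() itself rather than from 
the socket's pending error, so the two cases are worth logging separately.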

On November 29, 2012 at 12:43:42 PM PST, Timo Sirainen <[email protected]> wrote:
>>> I wonder if something like this would work:
>>> 
>>> if (!proxy->client && read(proxy->fd_ssl, &err, 0) < 0 && errno == 
>>> ENOTCONN) {

On November 29, 2012 at 2:12:18 PM PST, Ben Morrow <[email protected]> wrote:
>> How about calling getpeername on fd_ssl? That should reliably tell you
>> if the socket is connected or not. http://cr.yp.to/docs/connect.html
>> suggests that read is not always a reliable test for that.

Thanks, Ben.

On November 29, 2012 at 2:39:51 PM PST, Timo Sirainen <[email protected]> wrote:
> Yes, that sounds like it would work better:
> 
> if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL)  < 0 && 
> errno == ENOTCONN) {

Using getpeername or net_getpeername, errno is set to EINVAL = "socket has been 
shut down", so we could instead use 

   if (!proxy->client && net_getpeername(proxy->fd_ssl, NULL, NULL) < 0 &&
       errno == EINVAL) {

So it seems that we have the following options:

1.  net_geterror(proxy->fd_ssl) == EBADF
2.  read(proxy->fd_ssl, &err, 0) < 0  &&  errno == ENOTCONN
3.  net_getpeername(proxy->fd_ssl, NULL, NULL) < 0  &&  errno == EINVAL

Which is preferable?  Should the "#ifdef __APPLE__" remain? or would any of 
these tests be appropriate for other platforms as well?
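
One way option 3 could be made less platform-specific is to accept both errno 
values, since POSIX lists ENOTCONN and (as a "may fail" case for a socket that 
has been shut down) EINVAL for getpeername().  A rough sketch using plain 
getpeername() rather than net_getpeername; the helper name socket_peer_gone is 
hypothetical, and whether every platform reports one of these two values in 
this situation is an assumption, not something tested here:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: true if getpeername() reports that fd has no
   connected peer.  ENOTCONN is the usual errno; EINVAL is what was
   observed on OS X in this thread for a shut-down socket. */
static bool socket_peer_gone(int fd)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);

	if (getpeername(fd, (struct sockaddr *)&ss, &len) == 0)
		return false;	/* a peer address exists: still connected */
	return errno == ENOTCONN || errno == EINVAL;
}
```

Any other failure (e.g. EBADF) is deliberately not treated as "peer gone", so 
the caller can still report it as a separate error.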

On Nov 28, 2012, at 10:18PM PST, Timo Sirainen <[email protected]> wrote:
>>>>> This is either OSX bug or OpenSSL bug.. Apparently what happens is:
>>>>> 
>>>>> 1. Client sends SYN packet to Dovecot
>>>>> 2. Dovecot accept()s the connection (sends SYN-ACK) and goes into OpenSSL 
>>>>> code
>>>>> 3. Client doesn't send ACK to Dovecot. Does it send RST or nothing or 
>>>>> something else? I don't know.
>>>>> 4. OSX notices anyway that something is wrong with the socket, and kqueue 
>>>>> says that the socket is ready for reading
>>>>> 5. OpenSSL read()s, which fails with ENOTCONN. But OpenSSL thinks this is 
>>>>> a non-fatal error and simply asks to be notified again when something can 
>>>>> be read
>>>>> 6. goto 4
>>>>> 
>>>>> So, whose bug is it? OpenSSL's ENOTCONN handling probably makes sense for 
>>>>> client connections where connect() hasn't finished yet. But then again, 
>>>>> this is accept()ed connection where it typically should fail like that. 
>>>>> Except I guess it might be correct behavior if read() is done after 
>>>>> SYN-ACK but before receiving ACK.
>>>>> 
>>>>> While OSX is receiving ACK from the client, it shouldn't say that the fd 
>>>>> is readable. It probably doesn't. But after it receives <something> it 
>>>>> realizes that the socket is disconnected. So read() probably shouldn't be 
>>>>> returning ENOTCONN anymore at this point, but instead ECONNRESET or 
>>>>> ETIMEDOUT.
>>>>> 
>>>>> See if the attached patch helps.
>>>>> 
>>>>> 
>>>>> On 29.11.2012, at 7.45, Erik A Johnson wrote:
>>>>>> Here's the log:
>>>>>> 
>>>>>> Nov 28 21:28:11 macbookpro-e17d.home dovecot[54139]: master: Dovecot 
>>>>>> v2.1.10 starting up (core dumps disabled)
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug: 
>>>>>> ssl_step()
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug: 
>>>>>> ssl_handshake: SSL_accept()=-1
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug: 
>>>>>> SSL_get_error() = 2
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug:  
>>>>>> - want_read
>>>>>> Nov 28 21:30:19 macbookpro-e17d.home dovecot[54141]: imap-login: Debug: 
>>>>>> ssl_set_io(0)
>>>>>> [last 5 lines are repeated until process is killed]
>>>>>> 
>>>>>> On Nov 26, 2012, at 11:38PM PST, Timo Sirainen <[email protected]> wrote:
>>>>>>> 
>>>>>>> Could you try with the attached patch, and with only the problematic
>>>>>>> client running? What does it log (the beginning of the session until it
>>>>>>> starts repeating the same lines)?
>>>>>>> 
>>>>>>> On 10.11.2012, at 12.44, Erik A Johnson wrote:
>>>>>>>> imap-login processes are hanging (using 100% of CPU) when connected 
>>>>>>>> from a client that is partially blocked by a firewall.  It appears 
>>>>>>>> that imap-login is stuck in a loop trying to complete an ssl 
>>>>>>>> handshake.  imap-login is working fine for other clients not blocked 
>>>>>>>> by the firewall (including localhost).
>>>>>>>> 
>>>>>>>> This is dovecot 2.1.10 under Mac OS X 10.8.2 (compiled from sources); 
>>>>>>>> the firewall is Little Snitch 3.0.1 blocking port 993, which appears 
>>>>>>>> to let the connection initiate but then squashes and disconnects the 
>>>>>>>> socket during ssl handshaking.
>>>>>>>> 
>>>>>>>> gdb backtrace and Activity Monitor's "Sample Process" show that 
>>>>>>>> imap-login is stuck calling ioloop-kqueue's io_loop_handler_run -> 
>>>>>>>> io_loop_call_io -> ssl_step repeatedly; dtruss shows that it is 
>>>>>>>> repeatedly making system calls to kevent and read, the latter 
>>>>>>>> returning -1 with errno 57=ENOTCONN="Socket is not connected".  (I 
>>>>>>>> also tried ./configure --with-ioloop=poll and --with-ioloop=select 
>>>>>>>> instead of the default best = kqueue but the results were the same; 
>>>>>>>> --with-ioloop=epoll didn't work because epoll is not available on this 
>>>>>>>> machine.)  The client, initiated by the command "openssl s_client 
>>>>>>>> -connect SERVER:993", first responds "CONNECTED(00000003)" but then 
>>>>>>>> immediately the error "60278:error:140790E5:SSL 
>>>>>>>> routines:SSL23_WRITE:ssl handshake 
>>>>>>>> failure:/SourceCache/OpenSSL098/OpenSSL098-44/src/ssl/s23_lib.c:182:". 
>>>>>>>>  The infinite loop is in src/lib/ioloop.c in the function 
>>>>>>>> "io_loop_run" where the statement "while (ioloop->running) 
>>>>>>>> io_loop_handler_run(ioloop)" is executed.
>>> 
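
The busy loop in steps 4 to 6 of the quoted analysis comes down to how a 
failed read() is classified: retry later, or give up on the connection.  A 
minimal sketch of one possible policy follows.  The function name and the 
"accepted" flag are hypothetical; this is neither OpenSSL's nor Dovecot's 
actual logic, and it assumes ENOTCONN is only a transient condition on 
sockets whose connect() is still in progress:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical policy: should the caller wait and retry after read()
   failed with errno value err?  "accepted" is true for sockets that
   came from accept(), false for outgoing connect() sockets. */
static bool read_error_is_retryable(int err, bool accepted)
{
	switch (err) {
	case EAGAIN:
#if EWOULDBLOCK != EAGAIN
	case EWOULDBLOCK:
#endif
	case EINTR:
		return true;
	case ENOTCONN:
		/* Plausibly transient while connect() has not finished;
		   treated as fatal on an accept()ed socket so that the
		   io loop does not spin on it. */
		return !accepted;
	default:
		return false;
	}
}
```

Under this policy the imap-login case above (an accept()ed socket returning 
ENOTCONN) would end the handshake instead of asking to be notified again.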


