On Thu, 18 Feb 2010, Mikolaj Golub wrote:

Below is a simple test program using unix sockets: the client does connect()/close() in a loop and the server does accept()/close().

Sometimes close() fails with a 'Socket is not connected' error:

Hi Mikolaj:

Thanks for this report, and sorry about not spotting your earlier post to freebsd-net. I've been fairly preoccupied the last month and not keeping up with the mailing lists. Could I ask you to file a PR on this, and forward me the PR number so I can claim ownership? This should prevent it from getting lost while I catch up.

In short, your evaluation seems reasonable to me -- have you tried tweaking soclose() to ignore ENOTCONN from sodisconnect(), to confirm that this diagnosis accounts for all the instances you've been seeing?

Robert N M Watson
Computer Laboratory
University of Cambridge



a.out: parent: close error: 57

or

a.out: child: close error: 57

It looks to me like a race in close(). Looking at soclose() in uipc_socket.c:

int
soclose(struct socket *so)
{
       int error = 0;

       KASSERT(!(so->so_state & SS_NOFDREF), ("soclose: SS_NOFDREF on enter"));

       CURVNET_SET(so->so_vnet);
       funsetown(&so->so_sigio);
       if (so->so_state & SS_ISCONNECTED) {
               if ((so->so_state & SS_ISDISCONNECTING) == 0) {
                       error = sodisconnect(so);
                       if (error)
                               goto drop;
               }

Isn't the problem here? so_state is checked for SS_ISCONNECTED and
SS_ISDISCONNECTING without locking, and then sodisconnect() is called, which
closes both sockets of the connection. So it looks to me that if close() is
called on both ends simultaneously, sodisconnect() can be called for both
ends, and for one of them ENOTCONN will be returned. Or have I missed
something?
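The suspected interleaving can be replayed deterministically in a toy userland model. This is only a sketch of the check-then-act pattern described above, not the kernel code: struct toy_socket, TOY_ISCONNECTED and all toy_* functions are hypothetical names invented for illustration, and the real sodisconnect() does considerably more.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy userland model, NOT the kernel code: every name here is hypothetical. */
#define TOY_ISCONNECTED 0x1

struct toy_socket {
	int state;                /* TOY_ISCONNECTED or 0 */
	struct toy_socket *peer;  /* other end of the pair */
};

/* Link two toy sockets into a connected pair. */
static void
toy_connect(struct toy_socket *a, struct toy_socket *b)
{
	a->state = b->state = TOY_ISCONNECTED;
	a->peer = b;
	b->peer = a;
}

/* Step 1 of close: the unlocked state check, as in the soclose() excerpt. */
static int
toy_check_connected(const struct toy_socket *so)
{
	return ((so->state & TOY_ISCONNECTED) != 0);
}

/* Step 2 of close: disconnect both ends, or fail if already disconnected. */
static int
toy_disconnect(struct toy_socket *so)
{
	if ((so->state & TOY_ISCONNECTED) == 0)
		return (ENOTCONN);
	so->peer->state &= ~TOY_ISCONNECTED;
	so->peer->peer = NULL;
	so->state &= ~TOY_ISCONNECTED;
	so->peer = NULL;
	return (0);
}

/*
 * Replay the suspected interleaving: both closers pass the check before
 * either one disconnects. Returns the error seen by the second closer.
 */
static int
toy_replay_race(void)
{
	struct toy_socket a, b;
	int a_sees_connected, b_sees_connected, error_a, error_b;

	toy_connect(&a, &b);
	a_sees_connected = toy_check_connected(&a);
	b_sees_connected = toy_check_connected(&b);
	assert(a_sees_connected && b_sees_connected);

	error_a = toy_disconnect(&a);  /* first closer tears down both ends */
	error_b = toy_disconnect(&b);  /* second closer acts on a stale check */
	assert(error_a == 0);
	return (error_b);
}
```

In this model toy_replay_race() returns ENOTCONN for the second closer, which matches the error 57 seen from close(). Whether the kernel can actually interleave this way is exactly the open question.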

We have been periodically observing ENOTCONN errors on unix socket close in
our applications, so it is not just curiosity :-) (I posted about our problem
to freebsd-net@ some time ago, but it did not attract any attention:
http://lists.freebsd.org/pipermail/freebsd-net/2009-December/024047.html).

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <strings.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <err.h>

#define UNIXSTR_PATH "/tmp/mytest.socket"
#define USLEEP  100

int main(int argc, char **argv)
{
        int                     listenfd, connfd, pid;
        struct sockaddr_un      servaddr;

        pid = fork();
        if (-1 == pid)
                errx(1, "fork(): %d", errno);

        if (0 != pid) { /* parent */

                if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
                        errx(1, "parent: socket error: %d", errno);

                unlink(UNIXSTR_PATH);
                bzero(&servaddr, sizeof(servaddr));
                servaddr.sun_family = AF_LOCAL;
                strcpy(servaddr.sun_path, UNIXSTR_PATH);

                if (bind(listenfd, (struct sockaddr *) &servaddr,
                    sizeof(servaddr)) < 0)
                        errx(1, "parent: bind error: %d", errno);

                if (listen(listenfd, 1024) < 0)
                        errx(1, "parent: listen error: %d", errno);

                for ( ; ; ) {
                        if ((connfd = accept(listenfd, (struct sockaddr *) NULL,
                            NULL)) < 0)
                                errx(1, "parent: accept error: %d", errno);

                        // usleep(USLEEP / 2); // (I) uncomment this or (II) below to avoid the race

                        if (close(connfd) < 0)
                                errx(1, "parent: close error: %d", errno);
                }

        } else { /* child */

                sleep(1); /* give the parent some time to create the socket */

                for ( ; ; ) {

                        if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
                                errx(1, "child: socket error: %d", errno);

                        bzero(&servaddr, sizeof(servaddr));
                        servaddr.sun_family = AF_LOCAL;
                        strcpy(servaddr.sun_path, UNIXSTR_PATH);

                        if (connect(connfd, (struct sockaddr *) &servaddr,
                            sizeof(servaddr)) < 0)
                                errx(1, "child: connect error %d", errno);

                        // usleep(USLEEP); // (II) uncomment this or (I) above to avoid the race

                        if (close(connfd) != 0)
                                errx(1, "child: close error: %d", errno);

                        usleep(USLEEP);
                }
        }

        return 0;
}

--
Mikolaj Golub
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[email protected]"
