Hi All,
on one of our brokers my colleagues have started seeing errors of the form:

error could not accept socket: Transport endpoint is not connected (qpid/sys/posix/Socket.cpp 58)

Does anybody have a good idea what is likely to be causing this?

The broker appears to be functioning and there are only a modest number of connections to it. We'd even upped the default connection limit "just in case", but the number of connections was way lower than that (only a couple of dozen or so).

Because the error seemed to relate to "accept", I did wonder if there was a backlog issue, but we don't seem to have lots of things trying to connect at once, so it shouldn't be an issue. Nonetheless we tried upping the backlog "just in case"; again, it doesn't seem to have changed things.

We still appear to be able to connect to the broker, use the qpid tools and even create federated links, but I don't like seeing this error. I've got no idea what it relates to and can't see what we might be doing wrong.

This is in a qpid 0.8 C++ broker.

I had a look through Socket.cpp and the error seems to be kicked off in getName():

    int result = -1;
    if (local) {
        result = ::getsockname(fd, (::sockaddr*)&name, &namelen);
    } else {
        result = ::getpeername(fd, (::sockaddr*)&name, &namelen);
    }

    QPID_POSIX_CHECK(result);

The QPID_POSIX_CHECK is on line 58, which is the line mentioned in the error message.
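
For anyone following along, here is a minimal sketch of how a check of this kind could turn a failed call into a message shaped like the one in the log, i.e. the strerror text followed by the file and line. The names checkPosixResult, SKETCH_POSIX_CHECK and describeFailure are made up for illustration; this is a stand-in written under that assumption, not the actual Qpid macro.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a POSIX result check (NOT the real Qpid macro).
// Throws "<strerror text> (<file> <line>)" when a call has returned < 0.
inline void checkPosixResult(int result, const char* file, int line) {
    if (result < 0) {
        const int savedErrno = errno;  // capture before anything overwrites it
        std::ostringstream msg;
        msg << std::strerror(savedErrno) << " (" << file << " " << line << ")";
        throw std::runtime_error(msg.str());
    }
}

#define SKETCH_POSIX_CHECK(RESULT) checkPosixResult((RESULT), __FILE__, __LINE__)

// Illustration only: simulate a call that failed with the given errno value
// and return the text of the exception the check raises.
std::string describeFailure(int err) {
    assert(err > 0);
    errno = err;
    try {
        SKETCH_POSIX_CHECK(-1);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return std::string();
}
```

With glibc, describeFailure(ENOTCONN) gives a string beginning "Transport endpoint is not connected", which is the wording in our log, so the message is consistent with errno being ENOTCONN at the point of the check.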


Looking through the man pages for getsockname and getpeername, getsockname doesn't seem to have any relevant errors, but getpeername mentions:

       ENOTCONN
              The socket is not connected.

Which looks to be a likely place for the error to have originated.
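
To check that reading of the man pages, here is a small sketch that makes the same two calls on a socket with no peer. It mirrors the local/peer switch in getName() but is not Qpid code: nameLookupErrno and unconnectedSocketShowsEnotconn are names invented for this example, and the behaviour described is what I would expect on Linux.

```cpp
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: performs the same lookup as the excerpt above and
// returns 0 on success, or the errno value left by the failing call.
int nameLookupErrno(int fd, bool local) {
    assert(fd >= 0);
    sockaddr_storage name;
    socklen_t namelen = sizeof(name);
    int result = -1;
    if (local) {
        result = ::getsockname(fd, reinterpret_cast<sockaddr*>(&name), &namelen);
    } else {
        result = ::getpeername(fd, reinterpret_cast<sockaddr*>(&name), &namelen);
    }
    return result < 0 ? errno : 0;
}

// Tries both lookups on a freshly created, never-connected TCP socket.
// Returns true if the local lookup succeeds and the peer lookup fails
// with ENOTCONN.
bool unconnectedSocketShowsEnotconn() {
    const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return false;  // could not create a socket at all
    }
    const int localErr = nameLookupErrno(fd, true);
    const int peerErr = nameLookupErrno(fd, false);
    ::close(fd);
    return localErr == 0 && peerErr == ENOTCONN;
}
```

This only shows that the peer lookup is the branch that can produce ENOTCONN when there is no connected remote end; it says nothing about why an accepted socket in the broker would be in that state.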


So this looks like the result of a remote (to the broker) connection endpoint not being present, but I'm not at all sure what circumstances would cause this in a qpid broker. The brokers have been restarted and the host has been rebooted, and we're still seeing these messages appear regularly (every few seconds). We don't have anything (knowingly!) persisted.

Some more digging suggests that Socket::getName is called by getPeername and getPeerAddress. Grepping for getPeername seems to indicate that nothing actually calls it; doing the same for getPeerAddress indicates qpid/client/TCPConnector.cpp as the most likely caller. I'm not an expert in the code base, though, and this would seem odd as the error is (I believe) in the broker logs.

I'm wondering if there's something fishy going on with a federated connection. I'm guessing that could potentially show up as a client error; presumably a source route might be implemented as if it were a client?

I'd love to know if anyone else has seen this.

regards,
Frase









