Frase,
It sounds like your endpoint is terminating the connection before accept
has had a chance to 'accept' the TCP connection and return a valid
socket descriptor.
Have you tried using tcpdump to see which TCP handshakes are taking
place (tcpdump dst port 5672)?
Clive
On 22/05/2012 18:59, Fraser Adams wrote:
Hi All,
On one of our brokers my colleagues have started seeing errors of the
form:
error could not accept socket: Transport endpoint is not connected
(qpid/sys/posix/Socket.cpp 58)
Does anybody have a good idea what is likely to be causing this??
The broker appears to be functioning and there are only a modest
number of connections to it. We'd even upped the default connection
limit "just in case", but the number of connections was way lower than
that (only a couple of dozen or so).
Because the error seemed to relate to "accept" I did wonder if there
was a backlog issue, but we don't seem to have lots of things trying
to connect at once, so it shouldn't be an issue. Nonetheless we tried
upping the backlog "just in case"; again, it doesn't seem to have
changed things.
We still appear to be able to connect to the broker, use the qpid
tools and even create federated links, but I don't like seeing this
error. I've got no idea what it relates to and can't see what we might
be doing wrong.
This is in a qpid 0.8 C++ broker.
I had a look through Socket.cpp and the error seems to be kicked off
in getName():

    int result = -1;
    if (local) {
        result = ::getsockname(fd, (::sockaddr*)&name, &namelen);
    } else {
        result = ::getpeername(fd, (::sockaddr*)&name, &namelen);
    }
    QPID_POSIX_CHECK(result);

The QPID_POSIX_CHECK is line 58, which is mentioned in the error message.
Looking through the man pages for getsockname and getpeername,
getsockname doesn't seem to have any relevant errors but getpeername
mentions:

    ENOTCONN
        The socket is not connected.

Which looks to be a likely place for the error to have originated.
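
To illustrate the ENOTCONN case, here is a minimal sketch rather than
anything taken from the qpid source. It calls getpeername on a TCP socket
that was never connected and reports the errno that comes back. The
helper names peerNameErrnoOnUnconnectedSocket and describe are made up
for this example, and it assumes a POSIX system.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not part of qpid): calls getpeername() on a freshly
// created, never-connected TCP socket and returns the resulting errno,
// or 0 if the call unexpectedly succeeds.
int peerNameErrnoOnUnconnectedSocket() {
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return errno;  // could not create the socket at all
    }

    sockaddr_storage name;
    socklen_t namelen = sizeof(name);
    int result = ::getpeername(fd, reinterpret_cast<sockaddr*>(&name), &namelen);

    // Save errno before close() has a chance to overwrite it.
    int err = (result < 0) ? errno : 0;
    ::close(fd);
    return err;
}

// Hypothetical helper: the message text the C library associates with an errno.
std::string describe(int err) {
    return std::strerror(err);
}
```

With no peer on the socket, the helper should return ENOTCONN. On Linux
with glibc, describe(ENOTCONN) gives "Transport endpoint is not
connected", the same text that appears in the broker log.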
So this looks like the result of a remote (to the broker) connection
endpoint not being present, but I'm not at all sure what circumstances
would cause this in a qpid broker. The brokers have been restarted and
the host has been rebooted, and we're still seeing these messages
appear regularly (every few seconds). We don't have anything
(knowingly!!) persisted.
Some more digging suggests that Socket::getName is called by
getPeername and getPeerAddress. Grepping for getPeername seems to
indicate that nothing actually calls it; doing the same for
getPeerAddress points to qpid/client/TCPConnector.cpp as the most
likely caller. I'm not an expert in the code base, though, and this
would seem odd as the error is (I believe) in the broker logs.
I'm wondering if there's something fishy going on with a federated
connection. I'm guessing that could potentially show up as a client
error, as presumably a source route might be implemented as if it
were a client?
I'd love to know if anyone else has seen this.
regards,
Frase
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]