On Sat, 1 Apr 2017, Otto Moerbeek wrote:
> On Sat, Apr 01, 2017 at 01:20:25AM -0500, Luke Small wrote:
>
> > Here are two programs. They both fork two clients that try to connect
> > on port 'portno' that is listened to in main(). It spawns a receive
> > that receives passed file descriptors passed from main(). It passes a
> > file descriptor of the client connection to receive twice.
> > server_sample0.c uses the man page code. server_sample.c uses my
> > example. the former fails to pass the file descriptor on the second
> > try. the latter succeeds both times. I don't think you have any more
> > questions.
>
> Well, it took some effort to get this out of you.
>
> It seems that a problem exists indeed. It happens when the iovec data is
> not filled in, as the example uses. recv(2) only returns the number of
> bytes read from the iovec, so there seems to be some confusion about a
> failed read and a read of zero bytes. I'll check with the standard what
> is supposed to happen when only ancillary data is sent.

Well, it's not a failed read: recvmsg() returns 0, not -1.

The issue is that in the kernel socket receive buffer, control messages
from a single send are always followed by a normal data buffer. In
kernel terms, an MT_CONTROL mbuf chain is always followed by an MT_DATA
mbuf chain...even when there's no data sent. In that case, the MT_DATA
mbuf has length zero.

This works fine on the sending side, but when recvmsg() finishes with
the control messages and gets to the data buffer, it thinks it's done,
as the caller requested that nothing be copied out, and it doesn't
remove the zero-length MT_DATA mbuf, leaving it at the head of the
socket buffer. Subsequent calls see no control messages at the start
and then again do nothing to the data buffer.

Note this is specific to SOCK_STREAM sockets: the boundary-preserving
behavior of SOCK_DGRAM and SOCK_SEQPACKET means it doesn't happen there.

To avoid this, the application doesn't need to *send* any data, it just
needs to always accept at least one byte of data when calling
recvmsg(). Even if there's no data there, that acceptance of more than
zero bytes of data will let recvmsg() peel off the zero-length MT_DATA
mbuf from the send.

I guess the questions then are
1) is this a bug? can it be fixed? and
2) if not, should it be documented and where?

On the latter, I'm not convinced the example code in CMSG_DATA(3) is the
place to do so. If we deleted all the EXAMPLES sections from manpages
they should be merely more difficult to understand, not incomplete.
More importantly, this behavior isn't related to the CMSG_* macros at
all, but rather to recvmsg(2) itself. Maybe a caveat there?


Philip Guenther
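
For reference, here is a minimal sketch of the receive-side workaround
described above: recv_fd() always hands recvmsg() a one-byte iovec, so a
zero-length data buffer queued behind the control message gets consumed
rather than left at the head of the socket buffer. The names send_fd(),
recv_fd() and union fd_cmsgbuf are made up for illustration and are not
from the man page or from the programs in this thread. The dummy byte in
send_fd() is my own assumption, added only so the sketch also runs on
systems that require data alongside ancillary data on stream sockets;
per the explanation above, the sender doesn't strictly need it here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <string.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsgbuf {
	struct cmsghdr hdr;
	unsigned char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: pass fd over sock.  Returns 0, or -1 on error. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char byte = 0;		/* dummy payload (assumption, see above) */

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);	/* buffer always holds one header */
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == -1 ? -1 : 0;
}

/* Hypothetical helper: receive one fd from sock.  Returns it, or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char byte;
	int fd = -1;

	memset(&msg, 0, sizeof(msg));
	/* The point of the sketch: always accept at least one data byte. */
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) == -1)
		return -1;
	if (msg.msg_flags & MSG_CTRUNC)
		return -1;	/* control data did not fit */

	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
	    cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_len == CMSG_LEN(sizeof(int)) &&
		    cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
			break;
		}
	}
	return fd;
}
```

Calling recv_fd() twice in a row on the same SOCK_STREAM socket, after
two send_fd() calls on the peer, should then return a valid descriptor
both times.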
