On Sat, 1 Apr 2017, Otto Moerbeek wrote:
> On Sat, Apr 01, 2017 at 01:20:25AM -0500, Luke Small wrote:
> 
> > Here are two programs. They both fork two clients that try to connect 
> > on port 'portno' that is listened to in main(). It spawns a receive 
> > that receives passed file descriptors passed from main(). It passes a 
> > file descriptor of the client connection to receive twice. 
> > server_sample0.c uses the man page code. server_sample.c uses my 
> > example. the former fails to pass the file descriptor on the second 
> > try. the latter succeeds both times. I don't think you have any more 
> > questions.
> 
> Well, it took some effort to get this out of you.
> 
> It seems that a problem exists indeed. It happens when the iovec data is 
> not filled in, as the example uses. recv(2) only returns the number of 
> bytes read from the iovec, so there seems to be some confusion about a 
> failed read and a read of zero bytes. I'll check with the standard what 
> is supposed to happen when only auxiliary data is sent.

Well, it's not a failed read: recvmsg() returns 0, not -1.

The issue is that in the kernel socket receive buffer, control messages 
from a single send are always followed by a normal data buffer.  In kernel 
terms, an MT_CONTROL mbuf chain is always followed by an MT_DATA mbuf 
chain...even when there's no data sent.  In that case, the MT_DATA mbuf 
has length zero.

This works fine on the sending side.  On the receiving side, however, once 
recvmsg() has finished with the control messages and reaches the data 
buffer, it thinks it's done, because the caller requested that nothing be 
copied out.  It therefore doesn't remove the zero-length MT_DATA mbuf, 
which is left at the head of the socket buffer.  Subsequent calls see no 
control messages at the start and then again do nothing to the data buffer.

Note this is specific to SOCK_STREAM sockets: the boundary-preserving 
behavior of SOCK_DGRAM and SOCK_SEQPACKET means it doesn't happen there.

To avoid this, the application doesn't need to *send* any data, it just 
needs to always accept at least one byte of data when calling recvmsg().  
Even if there's no data there, that acceptance of more than zero bytes of 
data will let recvmsg() peel off the zero-length MT_DATA mbuf from the 
send.
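
For illustration, here is a rough sketch of a receiver that always offers 
a one-byte iovec, which is the workaround described above.  The helper 
names recv_fd() and send_fd() are made up for this example and are not 
taken from the man page or from the posted programs.  The sender here also 
transmits one byte of data; that isn't needed for the behavior described 
above, but Linux's unix(7) says it is required there for SOCK_STREAM, so 
the sketch does it to stay portable.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: pass one descriptor over a connected socket. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		unsigned char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte = 0;

	/* One byte of ordinary data, sent for portability only. */
	iov.iov_base = &byte;
	iov.iov_len = sizeof(byte);

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: receive one descriptor.  Returns the descriptor,
 * or -1 on error or if no SCM_RIGHTS message was attached.
 */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		unsigned char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte;
	int fd = -1;

	/* The point of the sketch: accept at least one byte of data. */
	iov.iov_base = &byte;
	iov.iov_len = sizeof(byte);

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) == -1)
		return -1;
	if (msg.msg_flags & MSG_CTRUNC)
		return -1;	/* control data did not fit */

	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL;
	    cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_len == CMSG_LEN(sizeof(int)) &&
		    cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
			break;
		}
	}
	return fd;
}
```

Calling recv_fd() once per send_fd() on the peer should then hand back a 
fresh descriptor each time, including on the second and later passes.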


I guess the questions then are
1) is this a bug?  can it be fixed? and
2) if not, should it be documented and where?

On the latter, I'm not convinced the example code in CMSG_DATA(3) is the 
place to do so.  If we deleted all the EXAMPLES sections from manpages 
they should be merely more difficult to understand, not incomplete.  More 
importantly, this behavior isn't related to the CMSG_* macros at all, but 
rather to recvmsg(2) itself.  Maybe a caveat there?


Philip Guenther
