Hello, I have a client/server networking application that exhibits
TCP socket handling errors. This only happens on FreeBSD, while NetBSD,
Linux, Solaris, etc. all seem to work correctly. I was hoping to get
some advice on what could be the root cause.

I have two processes, a client and a server, sending and receiving data
to/from each other on 127.0.0.1.

The client connects to the server and calls send(2)/recv(2) in a loop.
This is a bidirectional data exchange. Once all of its outgoing data has
been sent, the client calls shutdown(sockfd, SHUT_WR) and keeps receiving
on the same socket until recv(2) returns 0 bytes, which signals the end
of the incoming data. At that point the client calls close(sockfd) and
terminates.

Server has the same data transfer loop as the client.

I frequently get ECONNRESET when calling close(2), sometimes from the
server and sometimes from the client process. This should not be
happening, but I'm not sure what could be causing it.

The client logic is as follows:

1. Set sockfd nonblocking.
2. Call send(2)/recv(2) in a loop until N bytes have been transferred in
   each direction.
3. Set sockfd blocking.
4. Call send_buf() to send control handshake to server.
5. Call shutdown(sockfd, SHUT_WR) to signal end of send data from client.
6. Call recv_buf() to receive control handshake from server.
7. Call recv_buf() and verify it returned 0 bytes to indicate end of data
   from server.
8. Call close(sockfd) and verify success.

Step 8 sometimes fails: close(2) returns -1 with errno set to ECONNRESET.
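
To make steps 3 to 8 concrete, here is a minimal sketch of that sequence,
not the actual application code. The names client_finish(), fail() and
HS_LEN, and the "BYE" payload, are hypothetical. To keep the sketch short,
a single send(2) and a recv(2) with MSG_WAITALL stand in for send_buf() and
recv_buf(), so a short transfer is simply treated as a failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define HS_LEN 4 /* hypothetical handshake size, not from the real protocol */

/* Report which step failed together with the current errno text. */
static int fail(const char *step)
{
    fprintf(stderr, "%s failed: %s\n", step, strerror(errno));
    return -1;
}

/* Steps 3-8 of the client logic. Returns 0 on success, -1 on failure. */
int client_finish(int sockfd)
{
    const char hs_out[HS_LEN] = "BYE"; /* hypothetical handshake payload */
    char hs_in[HS_LEN];
    char extra;
    int flags;

    /* Step 3: clear O_NONBLOCK so the remaining calls block. */
    flags = fcntl(sockfd, F_GETFL, 0);
    if (flags < 0 || fcntl(sockfd, F_SETFL, flags & ~O_NONBLOCK) < 0)
        return fail("fcntl");

    /* Step 4: send the control handshake (stand-in for send_buf()). */
    if (send(sockfd, hs_out, sizeof hs_out, 0) != (ssize_t)sizeof hs_out)
        return fail("send");

    /* Step 5: half-close; the peer sees end of data from the client. */
    if (shutdown(sockfd, SHUT_WR) < 0)
        return fail("shutdown");

    /* Step 6: receive the peer's handshake (stand-in for recv_buf()). */
    if (recv(sockfd, hs_in, sizeof hs_in, MSG_WAITALL) != (ssize_t)sizeof hs_in)
        return fail("recv handshake");

    /* Step 7: the next read must return 0, i.e. the peer's FIN arrived. */
    if (recv(sockfd, &extra, 1, 0) != 0)
        return fail("recv EOF");

    /* Step 8: this is the call that sometimes fails with ECONNRESET. */
    if (close(sockfd) < 0)
        return fail("close");

    return 0;
}
```

Logging the failing step together with errno, as fail() does here, at least
confirms that the error comes from close(2) itself and not from an earlier
call.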

Functions send_buf() and recv_buf() are wrappers around send(2) and
recv(2) which restart those system calls until the specified number of
buffer bytes have been fully transferred or, in the case of recv_buf(),
until recv(2) returns 0, indicating end of data. They are designed to
work with blocking file descriptors and avoid short reads/writes.
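
A minimal sketch of what such wrappers might look like is below. The
signatures and return conventions are assumptions made for illustration,
and EINTR is assumed to be the only condition that restarts the call; the
real implementation may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Send exactly len bytes on a blocking socket.
 * Returns len on success, -1 on error (errno is set by send(2)).
 */
ssize_t send_buf(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = send(fd, p + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: restart */
            return -1;
        }
        done += (size_t)n;      /* short write: keep going */
    }
    return (ssize_t)done;
}

/*
 * Receive up to len bytes on a blocking socket.
 * Returns the number of bytes received, which is less than len only if
 * the peer closed its sending side first (0 means immediate end of data),
 * or -1 on error (errno is set by recv(2)).
 */
ssize_t recv_buf(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = recv(fd, p + done, len - done, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: restart */
            return -1;
        }
        if (n == 0)
            break;              /* end of data: peer sent FIN */
        done += (size_t)n;      /* short read: keep going */
    }
    return (ssize_t)done;
}
```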

I don't understand why close(2) sometimes returns ECONNRESET when the
previous recv(2) call at step 7 returned 0 bytes, indicating the remote
TCP end sent us a FIN.

I don't set the SO_LINGER socket option, and when I checked the default
on FreeBSD it reports l_onoff=0, l_linger=0, so there should be no
immediate RST on socket close(2).
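
For anyone who wants to repeat that check, a small sketch using
getsockopt(2) with SO_LINGER follows. The helper names get_linger() and
print_linger() are hypothetical and not part of the application.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Fetch the current SO_LINGER settings; returns 0 on success, -1 on error. */
int get_linger(int sockfd, struct linger *lg)
{
    socklen_t len = sizeof *lg;

    return getsockopt(sockfd, SOL_SOCKET, SO_LINGER, lg, &len);
}

/* Print the settings in the same l_onoff/l_linger form quoted above. */
int print_linger(int sockfd)
{
    struct linger lg;

    if (get_linger(sockfd, &lg) < 0) {
        perror("getsockopt(SO_LINGER)");
        return -1;
    }
    printf("l_onoff=%d, l_linger=%d\n", lg.l_onoff, lg.l_linger);
    return 0;
}
```

Calling print_linger(sockfd) just before close(sockfd) would show whether
anything changed the option between socket creation and close.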

Does anyone have any suggestions or ideas?

Thanks.
