Gary Mills wrote:
> It's always for the data connection, with the process sleeping
> in read() on that socket.  I assume it's waiting for a FIN from
> the client.  Shouldn't this state time out?
As described in RFC 793, no, it should not time out.  That's the
half-closed state: we've sent FIN to the peer, he's ACK'd our FIN,
and we're waiting for him to send his own FIN.  He could easily send
an arbitrary amount of additional data to us before closing, so we
can't just time out -- especially with the application still waiting
in read().

When one side is in FIN-WAIT-2 state, the peer should be in
CLOSE-WAIT state, which means that the peer's TCP implementation is
waiting for the application to close the socket.  The application may
still issue write() before doing so.

One thing that can cause such a problem (besides just normal use of
half-closed TCP connections) is a descriptor leak.  If the peer is
forking and forgetting about its open descriptors, it can
accidentally hold them open, resulting in exactly this sort of
behavior.

> # ndd /dev/tcp tcp_fin_wait_2_flush_interval
> 675000
>
> It never does.  Is the server supposed to take some action?
> All of the timeouts are set in the ftpaccess file.  Is there a bug
> in the Solaris kernel?  I can't find one that's documented.
>
> This is running under Solaris 10.  I can open a support case,
> but I'd like to get a little more information first.

That timer does something only if the socket is detached -- meaning
that the local application has issued close() on all of its open
copies of that descriptor.  It does nothing if the application has
just used shutdown(,1).

If the local application has issued close(), and the socket still
hangs around for more than 675 seconds, then that sounds like a bug.
But since the application is sitting in read(), the timer has no
effect, and can't have an effect.  It'd break TCP if it did.

-- 
James Carlson         42.703N 71.076W         <carls...@workingcode.com>
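
For anyone who wants to see the FIN-WAIT-2 / CLOSE-WAIT situation
described above on a loopback connection, here is a minimal sketch in
Python.  It is only an illustration, not anything taken from the FTP
server in question: the function name half_close_demo and the event
labels are made up, and dup() stands in for the extra copy of a
descriptor that a forked child would inherit.

```python
import socket


def half_close_demo():
    """Return what each side observes after the 'server' half-closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    events = []

    # Server sends FIN but keeps its descriptor open, like shutdown(,1).
    server.shutdown(socket.SHUT_WR)
    events.append(("client sees EOF", client.recv(16)))

    # The client (CLOSE-WAIT) may still write, and the server
    # (FIN-WAIT-2) must still be able to read what it sends.
    client.sendall(b"late data")
    events.append(("server reads", server.recv(16)))

    # Hypothetical descriptor leak: a duplicate of the client's
    # descriptor stays open, so close() on the original sends no FIN.
    leaked = client.dup()
    client.close()
    server.settimeout(0.5)  # only so the demo does not block forever
    try:
        server.recv(16)
        events.append(("after close with leak", "eof"))
    except socket.timeout:
        events.append(("after close with leak", "still waiting"))

    # Once the last copy is closed, the FIN goes out and read() returns EOF.
    leaked.close()
    events.append(("after last close", server.recv(16)))
    server.close()
    return events


if __name__ == "__main__":
    for label, value in half_close_demo():
        print(label, value)
```

The third step is the interesting one: the server side has no way to
tell a leaked descriptor from a peer that simply has not finished
writing yet, which is why the wait cannot safely be timed out by TCP
itself.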