Goswin,

--On 29 May 2011 16:03:18 +0200 Goswin von Brederlow <[email protected]> wrote:
>> I'm not sure what you mean by "synchronously". The current client issues
>> requests and processes replies asynchronously, i.e. there may be more
>> than one outstanding.
>
> The server doesn't, currently. So it will have replied to all requests
> before it reads the NBD_CMD_DISC.

With the current server. However, despite a reply being queued in respect
of all preceding requests, it may be that not all of those packets have
reached the wire. They may be in a socket buffer.

> If the server were asynchronous then yes. Now there might be an
> in-flight request but it will be completed before the server dies. And I
> believe that behaviour should remain.

I think you are misunderstanding the problem. A client sends a normal
command, followed by a disconnect (which is legal). The server then
accepts the normal command, processes it, and sends a reply (by which I
mean "does a write() to the socket"). The server then processes the
NBD_CMD_DISC, which does a close() on the socket and exits the process.
My concern is that this does not necessarily mean that the data enqueued
in the socket buffer is in fact sent on the wire.

This does not affect the current client in normal operations, because it
does not send the NBD_CMD_DISC until all replies have been received;
however, this is a matter of observation, not of coding guarantee.

All I am saying is that we should take action to ensure we do send any
queued replies (queued here means "processed and in the socket's SNDBUF")
prior to any action which might cause them to be junked.

If you look at Stevens' UNIX Network Programming p202, on SO_LINGER,
there are 3 possible behaviours:

1. l_onoff is 0: l_linger is ignored, close() returns immediately, and
   TCP will try to deliver any queued data to the peer.

2. l_onoff is nonzero and l_linger is zero: TCP aborts the connection
   when it is closed, ***that is, TCP discards any data still remaining
   in the socket send buffer and sends an RST to the peer*** (this is
   what we should avoid).

3. l_onoff is nonzero and l_linger is nonzero: the kernel lingers, i.e.
   close() blocks until either the data is sent and acknowledged or the
   linger time expires (which of the two happened is indicated by the
   error code). (This is OK provided we use a sufficient linger time.)

That implies that the default (l_onoff=0, l_linger ignored) would be
fine. However, under (e.g.) SVR4, close() can lose data. See (e.g.):

  http://tinyurl.com/odlj5

> Closing a socket: if SO_LINGER has not been called on a socket, then
> close() is not supposed to discard data. This is true on SVR4.2 (and,
> apparently, on all non-SVR4 systems) but apparently not on SVR4; the use
> of either shutdown() or SO_LINGER seems to be required to guarantee
> delivery of all data.

In general terms, it is not sufficient to rely solely on close() to send
data in portable programs.
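For concreteness, here is a minimal sketch of behaviour 3 above (the
lingering close). This is illustration only, not a patch against
nbd-server; the function name and parameters are invented:

    #include <sys/socket.h>

    /* Sketch: request a lingering close (case 3 above), so that close()
     * blocks until queued data is sent and acknowledged, or until the
     * linger time expires. */
    static int set_lingering_close(int sock, int seconds)
    {
        struct linger lng;

        lng.l_onoff  = 1;        /* nonzero: linger on close() */
        lng.l_linger = seconds;  /* linger time, in seconds */
        return setsockopt(sock, SOL_SOCKET, SO_LINGER, &lng, sizeof(lng));
    }

Choosing a sufficiently long linger time then remains the caller's
problem, per (3) above.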
>>> Here is what I think should happen:
>>> - on receiving a NBD_CMD_DISC request you shutdown(fd, SHUT_RD)
>>> - process and reply to any pending requests
>>> - fsync() /* implicit flush, just to be nice */
>>> - shutdown(fd, SHUT_WR)
>>> - close(fd)
>>> - exit()
>>
>> That alone doesn't help (I am not sure we do the shutdown but
>> it might be an improvement).

I missed the shutdown(SHUT_WR) here - that will indeed do what is
required. What I was saying is that the shutdown(SHUT_RD) is unnecessary
and insufficient (see below for why). I presume you mean fsync() the
backing store (as opposed to the socket) here - yes, I think that's a
good idea.

So I would do (see the sketch at the end of this mail):

  fsync(backing store)
  shutdown(socket, SHUT_WR)
  close(fd)
  exit()

> It (or close) blocks with SO_LINGER set or goes into background
> otherwise. Also lingering is always done on exit.

I can't find a reference for the assertion that there is automatic
lingering on process exit on all platforms. My understanding is that
exiting a process does an implicit close() on its fds, and that's all,
but I might be wrong.

> From all google can find me, the kernel will still try to send any
> remaining data in the outgoing socket buffer till the SO_LINGER timeout
> or tcp timeout kills the socket for good. There seems to be no way to
> "wait for all data to be sent" on a socket prior to closing it.

Sure, but nbd-server is a portable program.

>>> I was thinking of a buggy or malicious client. Say there is a bug in
>>> the linux kernel so it sends the NBD_CMD_DISC followed by a
>>> NBD_CMD_READ. Then we tear down the connection and never reply to the
>>> READ. Is that better than replying with an error to the READ?
>>
>> We tear down the client, and exit the process. The socket is closed,
>> so the client will get EPIPE. If the client is buggy or malicious,
>> that's no better than it deserves!
>
> You forget the network latency. The client can send additional commands
> before the server can tear down the socket. Remember the assumption is
> a buggy or malicious client. Clearly a correct client should never ever
> send anything after NBD_CMD_DISC. It should probably even shutdown the
> writing side of its socket.

I don't think network latency comes into it. The NBD_CMD_DISC is sent by
the client. Unix stream sockets (and TCP) are ordered, therefore anything
the client sends after the _DISC will be received after the _DISC is
received. On receipt of the _DISC, the server closes the socket before it
does another select() or has any chance of reading anything else. The
close() will discard any data in the RCVBUF. So I don't think a buggy
client can ever cause any damage by sending data after NBD_CMD_DISC.

Doing a shutdown(SHUT_RD) doesn't really help, because the buggy client
might have sent more commands after the NBD_CMD_DISC before you get to
the shutdown(); indeed they might be in the same TCP packet.
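Putting all that together, the teardown on receipt of NBD_CMD_DISC would
look something like the sketch below. The names store_fd and net are
invented for illustration; they are not actual nbd-server identifiers:

    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Sketch of the proposed teardown on NBD_CMD_DISC. */
    static void handle_disc(int store_fd, int net)
    {
        fsync(store_fd);         /* flush the backing store to disk */
        shutdown(net, SHUT_WR);  /* push queued replies to the peer; per
                                  * the SVR4 caveat above, shutdown() or
                                  * SO_LINGER is required to guarantee
                                  * delivery, close() alone is not */
        close(net);              /* anything a buggy client sent after
                                  * NBD_CMD_DISC is discarded along with
                                  * the RCVBUF, which is fine */
        exit(EXIT_SUCCESS);
    }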
-- 
Alex Bligh