On Sat, May 28, 2011 at 05:35:22PM +0100, Alex Bligh wrote:
> Goswin,
>
> --On 28 May 2011 16:37:12 +0200 Goswin von Brederlow <[email protected]>
> wrote:
>
> My view is that this is derived from the linux request layer, in
> which case (having asked much the same question on fsdevel
> a couple of days ago) the answers appear to be as follows:
>
> > 1) Order of replies
> >
> > Currently nbd-server works all requests in order and replies in
> > order. Since every request/reply has a handle to uniquely pair them I
> > assume replying to requests out of order is allowed and will (most
> > likely) be handled correctly by existing clients.
>
> Handles can be reused only once the command in question is completed.
>
> You may process commands out of order, and reply out of order,
> save that
> a) all write commands *completed* before you process a REQ_FLUSH
>    must be written to non-volatile storage prior to completing
>    that REQ_FLUSH (though apparently you should, if possible, make
>    this true for all write commands *received*, which is a stronger
>    condition) [Ignore this if you don't set SEND_REQ_FLUSH]

We already implement that stronger condition, because writes are handled
in the order they are received.

It shouldn't be too hard to implement when disordered handling of
requests is done, either: stop handling incoming requests when you
receive a flush request; flag all outstanding requests so you know when
the flush can be done (after which you can start handling incoming
requests again); and handle the flush when all flagged requests have
been handled.

[...]
> > 2) Overlapping requests
> >
> > I assume that requests may overlap. For example a client may write a
> > block of data and read it again before the write was ACKed. This would
> > be unexpected behaviour from a proper client but not forbidden.
>
> Correct
>
> > As such
> > the server has to internally ensure the proper order of overlapping
> > requests.
>
> Slightly surprisingly, the fsdevel folk's answer to this is that you
> can disorder both reads and writes and do what is natural, i.e. do
> not maintain ordering. A file system which cares about the result
> should not issue reads of blocks for which the writes have not
> completed.

Interesting to know.

[...]
> > + not NBD_CMD_FLAG_FUA:
> >   a) reply when the data has been received
> >   b) reply when the data has been committed to cache (write() returned)
> >   c) reply when the data has been committed to physical medium
>
> You may do any of those. Provided you will write the data "eventually"
> (i.e. when you receive a REQ_FLUSH or a disconnect).
>
> > For a+b how does one report write errors that only appear after
> > the reply? Report them in the next FLUSH request?
>
> You don't. To be safe, I'd error every write (i.e. turn the medium
> read only).

I don't think errors that appear after the reply are possible in the
case of b (they are in the case of a, obviously)? Or what am I missing?

[...]
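Coming back to the flush handling outlined earlier (stop intake on a
flush, flag the outstanding requests, run the flush once the flagged
ones are done), here is a rough sketch of that bookkeeping. It is a toy
model in Python, not nbd-server code: the class name FlushBarrier, its
methods, and the "write"/"flush" command strings are all made up for
illustration, and the point where a real server would sync to disk is
only marked by a comment.

```python
from collections import deque


class FlushBarrier:
    """Toy model of a server that completes requests out of order.

    Hypothetical names throughout; this is not nbd-server code.
    """

    def __init__(self):
        self.in_flight = set()      # handles currently being processed
        self.flagged = set()        # handles the pending flush waits for
        self.pending_flush = None   # handle of the flush being waited on
        self.held = deque()         # requests parked while intake is stopped
        self.log = []               # order in which replies were sent

    def receive(self, handle, cmd):
        """Accept a request from the client."""
        if self.pending_flush is not None:
            # Intake is stopped: park the request until the flush is done.
            self.held.append((handle, cmd))
        elif cmd == "flush":
            # Flag everything outstanding and stop handling new requests.
            self.pending_flush = handle
            self.flagged = set(self.in_flight)
            self._maybe_flush()
        else:
            self.in_flight.add(handle)

    def complete(self, handle):
        """A worker finished the request with this handle (any order)."""
        self.in_flight.remove(handle)
        self.flagged.discard(handle)
        self.log.append(handle)
        self._maybe_flush()

    def _maybe_flush(self):
        if self.pending_flush is None or self.flagged:
            return
        # All flagged requests are done; a real server would sync here.
        self.log.append(self.pending_flush)
        self.pending_flush = None
        # Resume intake, replaying parked requests in arrival order.
        held, self.held = self.held, deque()
        for handle, cmd in held:
            self.receive(handle, cmd)


server = FlushBarrier()
server.receive(1, "write")
server.receive(2, "write")
server.receive(3, "flush")   # waits for 1 and 2
server.receive(4, "write")   # parked until the flush completes
server.complete(2)           # out-of-order completion is fine
server.complete(1)           # last flagged request done, so the flush runs
server.complete(4)
print(server.log)            # [2, 1, 3, 4]
```

Because every request received before the flush is flagged, this gives
the stronger "all writes *received*" condition rather than only the
"all writes *completed*" one, at the cost of stalling intake while the
flush is pending.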
> > * NBD_CMD_DISC: Wait for all pending requests to finish, close socket
>
> You should reply to all pending requests prior to closing the socket
> I believe, mostly as it's polite. I believe the current client doesn't
> send a disconnect until all replies are in,

I believe so too, yes.

[...]
> and I also think the server may behave a little badly here.

How so?

> > Should this flush data before closing the socket? And if so what if
> > there is an error on flush? I guess clients should send NBD_CMD_FLUSH
> > prior to NBD_CMD_DISC if they care.
>
> No, you should not rely on this happening. Even umount of an ext2 volume
> will not send NBD_FLUSH where kernel, client, and server support it.
> You don't need to write it then and there (in fact there is no 'then
> and there' as an NBD_CMD_DISC has no reply),

It does have one -- the FIN packet. But yeah, it's not an
application-layer reply, that much is true.

> but you cannot guarantee *at all* that you will have received any sort
> of flush under any circumstances.

Correct. All you know is that the server will close its file handles on
disconnect.

> > What if there are more requests after this while waiting for pending
> > requests to finish? Should they be ignored or return an error?
>
> I believe it is an, um, undocumented implicit assumption that no
> commands are sent after NBD_CMD_DISC is sent. The current server
> just closes the socket, which will probably result in an EPIPE
> upstream if the FIN packet gets back before these other commands
> are written.

The client will flush its outgoing queue before sending a disconnect
request. Indeed, if it didn't do that, badness would ensue.

[...]

-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a
_______________________________________________
Nbd-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nbd-general
