At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 4072
revision-id: [email protected]
parent: [email protected]
parent: [email protected]
committer: Canonical.com Patch Queue Manager <[email protected]>
branch nick: +trunk
timestamp: Tue 2009-03-03 04:24:09 +0000
message:
  (mbp) hpss streaming design docs
modified:
  doc/developers/network-protocol.txt networkprotocol.txt-20070903044232-woustorrjbmg5zol-1
    ------------------------------------------------------------
    revno: 3944.1.1
    revision-id: [email protected]
    parent: [email protected]
    committer: Martin Pool <[email protected]>
    branch nick: doc-hpss
    timestamp: Mon 2009-01-19 21:14:14 +1100
    message:
      Notes with Andrew about hpss streaming
    modified:
      doc/developers/network-protocol.txt networkprotocol.txt-20070903044232-woustorrjbmg5zol-1
=== modified file 'doc/developers/network-protocol.txt'
--- a/doc/developers/network-protocol.txt 2008-05-16 07:15:57 +0000
+++ b/doc/developers/network-protocol.txt 2009-01-19 10:14:14 +0000
@@ -2,7 +2,7 @@
 Network Protocol
 ================
 
-:Date: 2007-09-03
+:Date: 2009-01-07
 
 .. contents::
 
@@ -221,19 +221,24 @@
 
 The underlying message format is::
 
-  MESSAGE := "bzr message 3 (bzr 1.6)" NEWLINE HEADERS MESSAGE_PARTS
+  MESSAGE := MAGIC NEWLINE HEADERS CONTENTS END_MESSAGE
+  MAGIC := "bzr message 3 (bzr 1.6)"
   HEADERS := LENGTH_PREFIX bencoded_dict
-  MESSAGE_PARTS := MESSAGE_PART [MORE_MESSAGE_PARTS]
-  MORE_MESSAGE_PARTS := END_MESSAGE_PARTS | MESSAGE_PARTS
-  END_MESSAGE_PARTS := "e"
+  END_MESSAGE := "e"
+  BODY := MESSAGE_PART+
   MESSAGE_PART := ONE_BYTE | STRUCTURE | BYTES
   ONE_BYTE := "o" byte
   STRUCTURE := "s" LENGTH_PREFIX bencoded_structure
   BYTES := "b" LENGTH_PREFIX bytes
 
+(Where ``+`` indicates one or more.)
+
 This format allows an arbitrary sequence of message parts to be encoded
-in a single message.
+in a single message. The contents of a MESSAGE have a higher-level
+message, but knowing just this amount of data it's possible to
+deserialize and consume a message, so that implementations can respond to
+messages sent by later versions.
 
 Headers
 ~~~~~~~
 
@@ -254,36 +259,54 @@
 describes how such messages are encoded. All requests and responses
 defined by earlier protocol versions must be encoded in this way.
 
-Conventional requests will send a sequence of:
-
-* Arguments (a STRUCTURE of a tuple)
-
-* (Optional) body
-
-  * Single body (BYTES), or
-
-  * Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)
-
-    * if status is "E", followed by an Error (STRUCTURE)
-
-Conventional responses will send a sequence of:
-
-* Status (ONE_BYTE)
-
-* Arguments (a STRUCTURE of a tuple)
-
-* (Optional) body
-
-  * Single body (BYTES), or
-
-  * Streamed body (multiple BYTES parts), followed by a status (ONE_BYTE)
-
-    * if status is "E", followed by an Error (STRUCTURE)
-
-In all cases, the ONE_BYTE status is either "S" for Success or "E" for
-Error. Note that the streamed body from version two is now just multiple
+Conventional requests will send a CONTENTS of ::
+
+    CONV_REQ := ARGS SINGLE_OR_STREAMED_BODY?
+    SINGLE_OR_STREAMED_BODY := BYTES
+          | BYTES+ TRAILER
+
+    ARGS := STRUCTURE(argument_tuple)
+    TRAILER := SUCCESS_STATUS | ERROR
+    SUCCESS_STATUS := ONE_BYTE("S")
+    ERROR := ONE_BYTE("E") STRUCTURE(argument_tuple)
+
+Conventional responses will send CONTENTS of ::
+
+    CONV_RESP := RESP_STATUS ARGS SINGLE_OR_STREAMED_BODY?
+    RESP_STATUS := ONE_BYTE("S") | ONE_BYTE("E")
+
+If the RESP_STATUS is success ("S"), the arguments are the
+method-dependent result.
+
+For errors (where the Status byte of a response or a streamed body is
+"E"), the situation is analogous to requests. The first item in the
+encoded sequence must be a string of the error name. The other arguments
+supply details about the error, and their number and types will depend on
+the type of error (as identified by the error name).
+
+Note that the streamed body from version two is now just multiple
 BYTES parts.
 
+The end of the request or response is indicated by the lower-level
+END_MESSAGE. If there's only one BYTES element in the body, the TRAILER
+may or may not be present, depending on whether it was sent as a single
+chunk or as a stream that happens to have one element.
+
+  *(Discussion)* The success marker at the end of a streamed body seems
+  redundant; it doesn't have space for any arguments, and the end of the
+  body is marked anyhow by the end of the message. Recipients shouldn't
+  take any action on it, though they should map an error into raising an
+  error locally.
+
+  1.10 clients don't assert that they get a status byte at the end of the
+  message. They will complain (in
+  ``ConventionalResponseHandler.byte_part_received``) if they get an
+  initial success and then another byte part with no intervening bytes.
+  If we stop sending the final success message and only flag errors
+  they'll only get one if the error is detected after streaming starts but
+  before any bytes are actually sent. Possibly we should wait until at
+  least the first chunk is ready before declaring success.
+
 For new methods, these sequences are just a convention and may be varied
 if appropriate for a particular request or response. However, each
 request should at least start with a STRUCTURE encoding the arguments
@@ -292,11 +315,105 @@
 bencoded. As a result, unlike previous protocol versions, arguments in
 this version are 8-bit clean.)
 
-For errors (where the Status byte of a response or a streamed body is
-"E"), the situation is analogous to requests. The first item in the
-encoded sequence must be a string of the error name. The other arguments
-supply details about the error, and their number and types will depend on
-the type of error (as identified by the error name).
+  (Discussion) We're discussing having the byte segments be not just a
+  method for sending a stream across the network, but actually having them
+  be preserved in the rpc from end to end. This may be useful when
+  there's an iterator on one side feeding in to an iterator on the other,
+  if it avoids doing chunking and byte-counting at two levels, and if
+  those iterators are a natural place to get good granularity. Also, for
+  cases like ``insert_record_stream`` the server can't do much with the
+  data until it gets a whole chunk, and so it'll be natural and efficient
+  for it to be called with one chunk at a time.
+
+  On the other hand, there may be times when we've got some bytes from the
+  network but not a full chunk, and it might be worthwhile to pass it up.
+  If we promise to preserve chunks, then to do this we'd need two separate
+  streaming interfaces: "we got a chunk" and "we got some bytes but not
+  yet a full chunk". For ``insert_record_stream`` the second might not be
+  useful, but it might be good when writing to a file where any number of
+  bytes can be processed.
+
+  If we promise to preserve chunks, it'll tend to make some RPCs work only
+  in chunks, and others just on whole blocks, and we can't so easily
+  migrate RPCs from one to the other transparently to older
+  implementations.
+
+  The data inside those chunks will be serialized anyhow, and possibly the
+  data inside them will already be able to be serialized apart without
+  understanding the chunks. Also, we might want to use these formats e.g.
+  for pack files or in bundles, and so they don't particularly need
+  lower-level chunking. So the current (unmerged, unstable) record stream
+  serialization turns each record into a bencoded tuple and it'd be
+  feasible to parse one tuple at a time from a byte stream that contains a
+  sequence of them.
+
+  So we've decided that the chunks won't be semantic, and code should not
+  count on them being preserved from client to server.
+
+Early error returns
+~~~~~~~~~~~~~~~~~~~
+
+  *(Discussion)* It would be nice if the server could notify the client of
+  errors even before a streaming request has finished. This could cover
+  situations such as the server not understanding the request, it being
+  unable to open the requested location, or it finding that some of the
+  revisions being sent are not actually needed.
+
+  Especially in the last case, we'd like to be able to gracefully notice
+  the condition while the client is writing, and then have it adapt its
+  behaviour. In any case, we don't want to have to drop and restart the
+  network stream.
+
+  It should be possible for the client to finish its current chunk and
+  then its message, possibly with an error to cancel what's already been
+  sent.
+
+  This relies on the client being able to read back from the server while
+  it's writing. This is technically difficult for http but feasible over
+  a socket or ssh.
+
+  We'd need a clean way to pass this back to the request method, even
+  though it's presumably in the middle of doing its body iterator.
+  Possibly the body iterator could be manually given a reference to the
+  request object, and it can poll it to see if there's a response.
+
+  Perhaps we need to distinguish error conditions, which should turn into
+  a client-side error regardless of the request code, from early success,
+  which should be handled only if the request code specifically wants to
+  do it.
+
+Full-duplex operation
+~~~~~~~~~~~~~~~~~~~~~
+
+  Code not geared to do pipelined requests, and this might require doing
+  asynchrony within bzrlib. We might want to either go fully pipelined
+  and asynchronous, but there might be a profitable middle ground.
+
+  The particular case where duplex communication would be good is in
+  working towards the common points in the graphs between the client and
+  server: we want to send speculatively, but detect as soon as they've
+  matched up.
+
+  So we could for instance have a synchronous core, but rely on the OS
+  network buffering to allow us to work on batches of say 64kB. We can
+  also pipeline requests and responses, without allowing for them
+  happening out of order, or mixed requests happening at the same time.
+
+  Wonder how our network performance would have turned out now if we'd
+  done full-duplex from the start, and ignored hpss over http. We have
+  pretty good (readonly) http support just over dumb http, and that may be
+  better for many users.
+
+
+
+APIs
+====
+
+On the client, the bzrlib code is "in charge": when it makes a request, or
+asks for data from the network, that causes network IO. The server is
+event driven: the network code tells the response handler when data has
+been received, and it takes back a Response object from the request
+handler that is then polled for body stream data.
 
 Paths
 =====
