On Jan 27, 2014, at 7:26 PM, Yaron Goland <[email protected]> wrote:
> Nevertheless he did say that so long as one probes the connection then
> pipelining is known to work. Probing just means that you can't assume that
> the server you are talking to is a 1.1 server and therefore supports
> pipelining.
Well, yes, that's pretty clear — I mean, I know pipelining's been implemented.
(And on iOS and Mac the frameworks already know how to support pipelining, so
one doesn't have to do the probing oneself.)
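(For the curious, here is roughly what a pipelined exchange looks like at the socket level. This is just a Python sketch against a throwaway local server, not what any framework actually does; the handler and the paths are made up for illustration.)

```python
# Pipelining sketch: write two requests on one connection before reading
# either response. The server here is a local stand-in, not a real origin.
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # keep-alive; a prerequisite for pipelining

    def do_GET(self):
        body = ("hello from %s" % self.path).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# Send both GETs back to back, then read both responses in order.
sock = socket.create_connection((host, port))
req = "GET /%d HTTP/1.1\r\nHost: %s\r\n\r\n"
sock.sendall((req % (1, host) + req % (2, host)).encode())

data = b""
while data.count(b"hello from") < 2:
    data += sock.recv(4096)
sock.close()
server.shutdown()

print(data.count(b"HTTP/1.1 200"))  # prints 2: both responses, one connection
```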
The problems with pipelining are higher level than that. Did you read the text
by Ilya Grigorik that I linked to? Here's another excerpt:
• A single slow response blocks all requests behind it.
• When processing in parallel, servers must buffer pipelined responses,
which may exhaust server resources—e.g., what if one of the responses is very
large? This exposes an attack vector against the server!
• A failed response may terminate the TCP connection, forcing the
client to re-request all the subsequent resources, which may cause duplicate
processing.
• Detecting pipelining compatibility reliably, where intermediaries may
be present, is a nontrivial problem.
• Some intermediaries do not support pipelining and may abort the
connection, while others may serialize all requests.
— http://chimera.labs.oreilly.com/books/1230000000545/ch11.html#HTTP_PIPELINING
(Now, HTTP 2.0 is adding multiplexing, which alleviates most of those problems.
I'll be happy when we get to use it, but that probably won't be for a year or
two at least.)
I also mentioned the overhead of issuing a bunch of HTTP requests versus just
one. As a thought experiment, consider fetching a one-megabyte HTTP resource by
using a thousand byte-range GET requests each requesting 1K of the file. Would
this take longer than issuing a single GET request for the entire resource?
Yeah, and probably a lot longer, even with pipelining. The client and the
server each incur per-request overhead, both in header bytes on the wire and in
parsing and dispatching every request.
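To put rough numbers on just the wire overhead (the header sizes below are assumptions, not measurements, and this ignores the parsing and dispatch costs entirely):

```python
# Back-of-envelope arithmetic for the thought experiment above.
n_requests = 1000    # a thousand byte-range GETs of ~1K each
req_hdr = 250        # assumed bytes of request line + headers per GET
resp_hdr = 300       # assumed bytes of status line + headers per response
resource = 1_000_000 # the one-megabyte resource itself

overhead = n_requests * (req_hdr + resp_hdr)
print(overhead, overhead / resource)  # 550000 bytes: ~55% extra on the wire
```

A single GET pays those header costs once instead of a thousand times.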
Finally, consider that putting a number of related resources together into a
single body enables better compression, since general-purpose compression
algorithms look for repeated patterns. If I have a thousand small documents,
each containing a property named "this_is_my_custom_property", and all of them
are returned in one response, every instance of that string gets compressed
down to a very short token. If they're returned as separate responses, each one
is compressed independently, so the repetition across documents can't be
exploited at all.
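Here's a quick zlib demonstration with a made-up document. The exact byte counts will vary with the compressor and the data, but the ratio is the point:

```python
# Compressing 1000 small documents together vs. separately.
import zlib

doc = b'{"this_is_my_custom_property": 42}'  # hypothetical small document
docs = [doc] * 1000

together = len(zlib.compress(b"".join(docs)))        # one body, one stream
separate = sum(len(zlib.compress(d)) for d in docs)  # one stream per response

# The combined body compresses to a tiny fraction of the separate total,
# because the repeated property name is shared across all the documents.
print(together, separate)
```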
—Jens