On 07 Oct 2015, at 4:30 PM, Stefan Eissing <stefan.eiss...@greenbytes.de> wrote:

>> Can you describe how cleanups occur in the http2 world?
> 
> In http2 land, requests happen on "pseudo" connections, connections 
> properly created by ap_run_create_connection(), but with their own filters 
> of type AP_FTYPE_PROTOCOL and AP_FTYPE_NETWORK, registered by mod_h2. 

In theory, as long as these filters are async, like the network and ssl filters 
are now, the http2 stuff can use write completion and be async too.
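
For concreteness, my mental model of that setup is roughly the following (a 
sketch only; the filter handle and function names are mine, not mod_h2's 
actual code):

#include "httpd.h"
#include "http_connection.h"
#include "util_filter.h"

/* registered once at module init via ap_register_output_filter() with
 * AP_FTYPE_NETWORK; the filter callback itself is assumed elsewhere */
static ap_filter_rec_t *h2_slave_output_filter;

static conn_rec *create_slave_connection(apr_pool_t *pool, server_rec *s,
                                         apr_socket_t *socket, long id,
                                         void *sbh, apr_bucket_alloc_t *alloc)
{
    /* the pseudo connection goes through the normal create_connection hook */
    conn_rec *c = ap_run_create_connection(pool, s, socket, id, sbh, alloc);
    if (c) {
        /* but gets mod_h2's own filter instead of the core network filter,
         * so its output ends up in the h2 multiplexer, not on the socket */
        ap_add_output_filter_handle(h2_slave_output_filter, NULL, NULL, c);
    }
    return c;
}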

> These filters copy the data (or move the files) from the processing filter's 
> thread into the h2 multiplexer (h2_mplx), where the master connection thread 
> reads it and sends it out to the client, properly framed for HTTP/2.
> 
> Memory-wise, master, multiplexer and slave connections have separate apr_pool 
> hierarchies, due to multi-threading issues with any other attempt to handle 
> it:
> - master connection: MPM-assigned thread, pool provided by core
> - multiplexer: everything protected by a mutex, child pools for every h2 stream
> - slave connection: child pools of the h2_workers assigned to them
> 
> Due to the non-multithreadability of apr_buckets, no buckets are ever moved 
> across threads: non-meta buckets are read, meta buckets are deleted. That 
> should work fine for EOR buckets, as all data has been copied already when 
> they arrive.
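
As I read it, the copy step amounts to roughly this (a hedged sketch; the 
function is mine and assumes the caller holds the mplx mutex):

#include "apr_buckets.h"

/* flattens the data bytes of the slave-side brigade into memory allocated
 * from the multiplexer's pool, so neither the buckets nor their allocator
 * ever cross the thread boundary */
static apr_status_t copy_out_to_mplx(apr_bucket_brigade *slave_bb,
                                     apr_pool_t *mplx_pool,
                                     char **data, apr_size_t *len)
{
    apr_status_t rv = apr_brigade_pflatten(slave_bb, data, len, mplx_pool);

    /* the original buckets, meta buckets such as EOR included, stay on the
     * slave side and are simply destroyed there */
    apr_brigade_cleanup(slave_bb);
    return rv;
}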

A key part of the httpd v2.x design was to achieve zero copy; ideally we 
should be using bucket setaside to pass the buckets between pools rather than 
copying them.

Can you explain "non-multithreadability of apr_buckets" in more detail? I take 
it this is the problem with passing a bucket from one allocator to another?

If so then the copy makes more sense.
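
For reference, the zero copy path I have in mind is roughly the following 
(only a sketch; setaside_brigade is a hypothetical helper):

#include "apr_buckets.h"

/* sets each bucket aside into a longer-lived pool instead of copying its
 * bytes; the bucket structures themselves still belong to the slave's
 * apr_bucket_alloc_t, which may well be the multithreading problem you
 * mean */
static apr_status_t setaside_brigade(apr_bucket_brigade *bb,
                                     apr_pool_t *longer_lived_pool)
{
    apr_bucket *b;
    apr_status_t rv;

    for (b = APR_BRIGADE_FIRST(bb); b != APR_BRIGADE_SENTINEL(bb);
         b = APR_BUCKET_NEXT(b)) {
        /* heap/pool/transient buckets morph or copy so their data survives
         * the original pool; file buckets just re-register their cleanup */
        rv = apr_bucket_setaside(b, longer_lived_pool);
        if (rv != APR_SUCCESS) {
            return rv;
        }
    }
    return APR_SUCCESS;
}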

> One special case is implemented for file buckets. If the number of already 
> open files is not "too high", apr_file_setaside() is used to have the file 
> handle cleanup registered with the stream pool, no longer with the slave 
> connection pool, and a new file bucket is created.
> 
> So, data/files can and will live long after the slave connection has gone 
> away and all its pools have been reclaimed. This is desired and even the 
> ideal case, as streaming out files can be done solely from the main 
> connection. This can interleave many streams using only a single thread.

This is what the various async MPMs do today.

What is attractive about this is releasing expensive backends early without 
them having to stick around waiting for frontends to eventually acknowledge a 
request.
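
For my own understanding, the file hand-over you describe would look roughly 
like this (a hedged sketch; everything around the apr_file_setaside() call is 
hypothetical, not mod_h2's actual code):

#include "apr_file_io.h"
#include "apr_buckets.h"

static apr_status_t pass_file_to_stream(apr_file_t *slave_file,
                                        apr_off_t offset, apr_size_t len,
                                        apr_pool_t *stream_pool,
                                        apr_bucket_brigade *master_bb)
{
    apr_file_t *stream_file = NULL;

    /* moves the file handle's cleanup from the slave connection pool to
     * the stream pool, so closing is tied to the stream's lifetime */
    apr_status_t rv = apr_file_setaside(&stream_file, slave_file, stream_pool);
    if (rv != APR_SUCCESS) {
        return rv;
    }

    /* a new file bucket is created on the master connection side */
    APR_BRIGADE_INSERT_TAIL(master_bb,
        apr_bucket_file_create(stream_file, offset, len, stream_pool,
                               master_bb->bucket_alloc));
    return APR_SUCCESS;
}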

> Stream pool destruction is synched with 
> 1. slave connection being done and no longer writing to it

How do you currently know the slave connection is done?

Normally a connection is cleaned up by the MPM that spawned it; I suspect 
you’ll need to replicate the same logic the MPMs use to tear down the 
connection, based on the c->aborted and c->keepalive flags.

Crucially, the slave connection needs to tell you that it’s done. If you kill a 
connection early, data will be lost.
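
Something along these lines, modelled on what the MPMs check after processing 
a connection (slave_is_done() is a hypothetical helper, not an existing API):

#include "httpd.h"

static int slave_is_done(conn_rec *c)
{
    /* the connection either aborted, or processing decided not to keep it
     * alive; only then is it safe to tear the slave down */
    return c->aborted || c->keepalive == AP_CONN_CLOSE;
}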

I suspect part of the problem is not implementing the algorithm that async MPMs 
use to kick filters with data in them. Without this kick, data in the slave 
stacks will never be sent. In theory, when the http2 filter receives a kick, it 
should pass the kick on to all slave connections.
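
Passing the kick on could be as simple as pushing a FLUSH down each slave's 
output filter stack, something like this (a sketch only; how the slaves are 
enumerated is up to mod_h2):

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

static apr_status_t kick_slave(conn_rec *slave)
{
    apr_bucket_brigade *bb = apr_brigade_create(slave->pool,
                                                slave->bucket_alloc);

    /* filters holding setaside data are expected to write it out when
     * they see the FLUSH */
    APR_BRIGADE_INSERT_TAIL(bb,
        apr_bucket_flush_create(slave->bucket_alloc));
    return ap_pass_brigade(slave->output_filters, bb);
}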

> 2. h2 stream having been written out to the client or otherwise being closed
> Only after 1+2 happened will this memory be reclaimed.

In the case of the h2 stream, you probably need to implement the same mechanism 
with c->aborted and c->keepalive so that the MPM cleans up the h2 stream for you.

You would need to implement cleanups on the slave connections, which would then 
mark the master h2 stream for cleanup by the MPM once the number of 
slave connections has reached zero.
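
A rough sketch of that cleanup chain, with h2_stream and its fields standing 
in for whatever mod_h2 actually uses (locking omitted):

#include "httpd.h"
#include "apr_pools.h"

typedef struct {
    int slave_count;   /* slave connections still attached to this stream */
    int cleanup_ready; /* set once the MPM may reclaim the stream */
} h2_stream;

static apr_status_t on_slave_pool_destroyed(void *ctx)
{
    h2_stream *stream = ctx;

    /* last slave gone: mark the master h2 stream for cleanup */
    if (--stream->slave_count == 0) {
        stream->cleanup_ready = 1;
    }
    return APR_SUCCESS;
}

static void attach_slave(h2_stream *stream, conn_rec *slave)
{
    stream->slave_count++;

    /* runs when the slave connection's pool is destroyed */
    apr_pool_cleanup_register(slave->pool, stream,
                              on_slave_pool_destroyed,
                              apr_pool_cleanup_null);
}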

> Transient buckets are used heavily on the master connection, as the way data 
> buckets are being generated does not suit the coalescing ssl filter as it is 
> currently designed. Instead, each master connection has a max-size buffer 
> where frames are assembled and properly chunked into nicely sized transient 
> buckets for passing down the network filters.
> 
> This is h2 bucket/pool handling in a nutshell.

Makes sense.
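
Just to confirm my reading of the transient bucket part, I picture something 
like this (a sketch; the frame buffer handling names are mine):

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

static apr_status_t pass_frame_buffer(conn_rec *master,
                                      apr_bucket_brigade *bb,
                                      const char *buffer, apr_size_t len)
{
    /* transient: the bytes are not copied and only need to stay valid
     * until this single pass down the filter chain returns */
    APR_BRIGADE_INSERT_TAIL(bb,
        apr_bucket_transient_create(buffer, len, master->bucket_alloc));
    return ap_pass_brigade(master->output_filters, bb);
}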

Regards,
Graham
—
