I just committed a new version of ap_core_output_filter() to the async-dev
branch. The semantics of the filter have changed in a fundamental way: the
original version does blocking writes and usually tries to write out all of
the available data before returning (except in cases where it buffers data
to avoid small writes), whereas this new version does nonblocking writes in
most cases.

The goal of the rewrite was to set the stage for a clean implementation of
asynchronous write completion in the Event and/or Leader/Followers MPMs.
The nonblocking behavior, however, appears to be useful in all MPMs. For
example, nonblocking writes can enable mod_include to parse the next
bunch of output while awaiting an ack from the client. To avoid unbounded
memory usage, the new filter does blocking writes if it has >= 64KB of data
buffered up.
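
As a rough illustration of the 64KB rule (a sketch only, not the committed
code; the struct, macro, and function names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* The 64KB figure is the one quoted above; the name is hypothetical. */
#define SKETCH_BUFFER_LIMIT (64 * 1024)

/* Hypothetical per-connection filter state. */
typedef struct {
    size_t bytes_buffered;   /* data set aside but not yet written */
} sketch_filter_state;

/* Returns nonzero when the next write should block, i.e. when the
 * filter is already holding SKETCH_BUFFER_LIMIT bytes or more. */
static int sketch_must_block(const sketch_filter_state *state)
{
    assert(state != NULL);
    return state->bytes_buffered >= SKETCH_BUFFER_LIMIT;
}
```

Below the limit the filter would attempt a nonblocking write and keep
whatever the socket doesn't accept; at or above it, the write blocks so
the buffered data can't keep growing.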

There are some significant differences from the earlier nonblocking output
filter that I posted a few weeks ago as part of the Event MPM async write
completion patch:

- If a nonblocking write attempt results in EAGAIN, the new filter returns
  APR_SUCCESS instead of APR_EAGAIN.  The old patch broke too much
  existing code that wasn't prepared for EAGAIN. The new filter can return
  APR_SUCCESS without having actually written the entire brigade, but
  that's also true of the original ap_core_output_filter(), with its various
  setaside cases.

- The new filter doesn't try to ignore flush buckets like the earlier patch
  did. Instead, when it encounters a flush bucket, it does a blocking write
  of everything up to that point. The earlier patch tried to detect certain
  patterns of buckets involving flush, EOS, and EOC that could be
  interpreted as "hand this data off to the write completion thread instead
  of actually writing it out immediately." But that logic was too brittle,
  as it depended on knowledge of the bucket patterns that the httpd core
  just happens to produce.  In the new design, the core output filter
  interprets a flush bucket as "write this data out before returning." To
  implement async write completion on top of this, we'll likely have to
  remove some of the points in the core that generate flush buckets--
  but that's a project for another day.
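
The two behaviors above can be sketched in plain C roughly as follows. This
is a simplified model, not the real filter: it uses errno and return code 0
in place of APR status codes, a flat buffer in place of a brigade, and a
fake socket; every name in it is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_SUCCESS 0

/* Hypothetical callbacks: a nonblocking write that returns the number of
 * bytes written or -1 with errno set, and a wait-until-writable call. */
typedef long (*sketch_write_fn)(const char *buf, size_t len);
typedef void (*sketch_wait_fn)(void);

typedef struct {
    const char *pending;    /* unwritten data kept for a later call */
    size_t pending_len;
} sketch_filter_ctx;

/* Fake socket: accepts sketch_fake_capacity bytes, then reports EAGAIN. */
static size_t sketch_fake_capacity = 0;
static int sketch_fake_waits = 0;

static long sketch_fake_write(const char *buf, size_t len)
{
    size_t n;
    (void)buf;
    if (sketch_fake_capacity == 0) {
        errno = EAGAIN;
        return -1;
    }
    n = len < sketch_fake_capacity ? len : sketch_fake_capacity;
    sketch_fake_capacity -= n;
    return (long)n;
}

/* Fake wait: pretend the client drained its receive window. */
static void sketch_fake_wait(void)
{
    sketch_fake_waits++;
    sketch_fake_capacity = 1024;
}

/* flush != 0 models "the data ends in a flush bucket". */
static int sketch_output_filter(sketch_filter_ctx *ctx,
                                sketch_write_fn write_fn,
                                sketch_wait_fn wait_fn,
                                const char *buf, size_t len, int flush)
{
    assert(ctx != NULL && write_fn != NULL && wait_fn != NULL);
    while (len > 0) {
        long n = write_fn(buf, len);
        if (n > 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK) {
            return errno;               /* a real error is still reported */
        }
        if (!flush) {
            /* Would block: remember the rest and report success.  A real
             * filter would set the data aside rather than keep a pointer. */
            ctx->pending = buf;
            ctx->pending_len = len;
            return SKETCH_SUCCESS;
        }
        wait_fn();                      /* flush: block until writable */
    }
    ctx->pending = NULL;
    ctx->pending_len = 0;
    return SKETCH_SUCCESS;
}
```

Without a flush, a socket that takes only 4 of 10 bytes leaves 6 bytes
pending and the caller still sees success; with a flush, the same call
waits and retries until nothing is pending.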

There are a few things missing in the new version:

- It doesn't concatenate sequences of really small buckets together the way
  the original does. If you send it a brigade containing 16 single-byte
  buckets, it will do a writev of 16 bytes. My inclination is to leave this
  "broken" in order to keep the code simple, unless someone has a real-world
  use case that produces such output.

- I haven't yet put in mod_logio support. This shouldn't be difficult to
  add, but I want to do some experiments with a new design: sending an
  End-Of-Request bucket that calls the request logger when all the buckets
  in front of it have been written to the network. If this works, the "EOR"
  bucket's destroy function might end up being the cleanest place to call
  the logio hooks.

- It doesn't yet do nonblocking reads on socket buckets. Can anyone recommend
  a good test case that makes use of socket buckets?
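
For the small-bucket case in the first item, here is a sketch of what such
a write might look like at the system-call level. It assumes one iovec
entry per bucket, which is an assumption for illustration rather than
something taken from the committed code, and the function name is made up.
It uses plain POSIX writev(), not APR:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define SKETCH_NBUCKETS 16

/* Write SKETCH_NBUCKETS one-byte buffers with a single writev() call,
 * the way a filter that never coalesces small buckets might.  Returns
 * whatever writev() returns: bytes written, or -1 on error. */
static ssize_t sketch_writev_single_bytes(int fd, const char *bytes)
{
    struct iovec vec[SKETCH_NBUCKETS];
    int i;

    assert(bytes != NULL);
    for (i = 0; i < SKETCH_NBUCKETS; i++) {
        vec[i].iov_base = (void *)(bytes + i);   /* one "bucket" per entry */
        vec[i].iov_len = 1;
    }
    return writev(fd, vec, SKETCH_NBUCKETS);
}
```

It is still one system call, but it carries 16 iovec entries for 16 bytes
of payload; copying the bytes into one contiguous buffer first would save
that per-entry overhead.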

Thanks,
Brian
