One of the biggest remaining performance problems in the httpd is the code that scans HTTP requests. In the current implementation, read_request_line() and ap_get_mime_headers() call ap_rgetline_core(), which has to:

  - create a temporary brigade
  - call ap_get_brigade() to get the next line of input from the core input filter, which in turn has to scan for LF and split the bucket
  - copy the content into a buffer
  - destroy the temp brigade
  - call itself recursively in the (rare) folding case

And all this happens for every line of the request header.
We're creating a ton of temporary brigades and temporary buckets in this process, plus registering and unregistering pool cleanup callbacks and doing at least one memcpy operation per line.

I'd like to streamline this processing so that ap_read_request() can do something like this:

  - get the input brigade from the input filter stack
  - loop through the brigade and parse the HTTP header
    (Note: in the common case, the whole header will be in the first bucket)
  - split the bucket where the request header ends, remove the bucket(s) containing the header from the brigade, and hand the remainder of the brigade back to the filter stack
  - then use the bytes in the header bucket(s) as a read-write copy of the request

(In the case where the whole request header is in one bucket, we could achieve zero-copy input for all the header fields by null-terminating them in place and storing those strings in r->headers_in, as long as we ensure that the data lasts for the lifetime of the request.)

What's the best way to do this? Turn the reading of the request header into an input filter? Or something more like the current implementation, but with an AP_MODE_SPECULATIVE read from the input filter stack to let ap_read_request() peek at the entire request? Or something else entirely?

Thanks,
--Brian
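
P.S. To make the in-place idea concrete, here is a minimal sketch in plain C. It is not httpd code and uses no APR calls. It assumes the whole request header is already in one contiguous, writable buffer (the single-bucket case above). The names hdr_field, parsed_header, find_header_end() and parse_header_in_place() are made up for illustration, and MAX_FIELDS is an arbitrary limit. The sketch makes one read-only pass to confirm the header is complete and has no folded lines, and only then null-terminates the fields in place, so a partial read leaves the buffer untouched.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 32   /* arbitrary limit, for the sketch only */

/* Hypothetical types; httpd would store these in r->headers_in instead. */
typedef struct { char *name; char *value; } hdr_field;

typedef struct {
    char     *request_line;
    hdr_field fields[MAX_FIELDS];
    size_t    nfields;
    size_t    consumed;   /* header bytes, including the blank line */
} parsed_header;

/* Return the offset just past the blank line that ends the header, or 0
 * if the header is incomplete, starts with a blank line, or contains a
 * folded line (cases this fast path does not handle). Read-only. */
static size_t find_header_end(const char *buf, size_t len)
{
    size_t pos = 0;
    int first = 1;

    while (pos < len) {
        const char *line = buf + pos;
        const char *lf = memchr(line, '\n', len - pos);
        size_t linelen;

        if (lf == NULL)
            return 0;                       /* no complete line yet */
        linelen = (size_t)(lf - line);
        if (linelen > 0 && line[linelen - 1] == '\r')
            linelen--;
        pos = (size_t)(lf - buf) + 1;
        if (linelen == 0)
            return first ? 0 : pos;         /* blank line ends the header */
        if (!first && (line[0] == ' ' || line[0] == '\t'))
            return 0;                       /* folded line: use slow path */
        first = 0;
    }
    return 0;
}

/* Split the header into strings that point into buf, writing '\0' over
 * line endings, the ':' separator, and trailing whitespace.
 * Returns 0 on success, -1 if the fast path does not apply or a field
 * line is malformed (in that last case buf may be partly modified). */
static int parse_header_in_place(char *buf, size_t len, parsed_header *out)
{
    size_t end = find_header_end(buf, len);
    size_t pos = 0;

    if (end == 0)
        return -1;
    out->request_line = NULL;
    out->nfields = 0;
    out->consumed = end;

    while (pos < end) {
        char *line = buf + pos;
        char *eol = memchr(line, '\n', end - pos);  /* found by pre-scan */
        char *colon, *value;

        pos = (size_t)(eol - buf) + 1;
        if (eol > line && eol[-1] == '\r')
            eol--;
        *eol = '\0';
        if (eol == line)
            break;                          /* the terminating blank line */
        if (out->request_line == NULL) {
            out->request_line = line;
            continue;
        }
        colon = memchr(line, ':', (size_t)(eol - line));
        if (colon == NULL || colon == line || out->nfields == MAX_FIELDS)
            return -1;
        *colon = '\0';
        value = colon + 1;
        while (*value == ' ' || *value == '\t')
            value++;                        /* skip leading whitespace */
        while (eol > value && (eol[-1] == ' ' || eol[-1] == '\t'))
            *--eol = '\0';                  /* trim trailing whitespace */
        out->fields[out->nfields].name = line;
        out->fields[out->nfields].value = value;
        out->nfields++;
    }
    return 0;
}
```

A caller would pass the data of the first bucket as buf. On success, out->consumed is where the bucket would be split, and everything after that offset goes back to the filter stack. On -1 the caller would fall back to something like the existing line-at-a-time path.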