> From: Brian Pane [mailto:[EMAIL PROTECTED]]
> 
> One of the biggest remaining performance problems in the httpd is
> the code that scans HTTP requests.  In the current implementation,
> read_request_line() and ap_get_mime_headers() call ap_rgetline_core(),
> which has to:
>    - create a temporary brigade
>    - call ap_get_brigade() to get the next line of input from the
>      core input filter, which in turn has to scan for LF and split
>      the bucket
>    - copy the content into a buffer,
>    - destroy the temp brigade
>    - call itself recursively in the (rare) folding case
> And all this happens for every line of the request header.

Have you looked at which of these are causing the most performance
problems?  It seems to me that the easiest thing to do would be to use
a persistent brigade, which eliminates two of those steps (the create
and the destroy).

The copy into a buffer is required, because you have no way of
knowing how many buckets were used in getting the header line from the
client.  The only way to solve this is to do the copy in the HTTP_IN
filter.  That would work just fine, and would remove the copy in the
common case.  Then, in the getline() function, you can remove the memcpy
altogether.  I don't really care about the recursive case, because it
almost never happens.

> We're creating a ton of temporary brigades and temporary buckets
> in this process, plus registering and unregistering pool cleanup
> callbacks and doing at least one memcpy operation per line.
> 
> I'd like to streamline this processing so that ap_read_request()
> can do something like this:
> 
>   - get the input brigade from the input filter stack
>   - loop through the brigade and parse the HTTP header
>     - Note: in the common case, the whole header will
>       be in the first bucket

No, please don't do this.  Currently, it is VERY easy to write a filter
that operates on the headers, because you can rely on the code
separating the headers into individual brigades.  I personally have a
filter that modifies the request line, and this model would make that
filter MUCH harder to write.

>   - split the bucket where the request header ends,
>     remove the bucket(s) containing the header from
>     the brigade, and hand the remainder of the brigade
>     back to the filter stack

This requires push-back, which has been discussed and vetoed, because it
adds a lot of complexity with very little gain.

Until we know where the performance problems are in the current model, I
believe that it makes little sense to redesign the whole model.  I think
you can streamline this by just re-using the same brigade for every
header and by moving the copy down into the HTTP_IN filter.

Ryan
