On 13/04/07, Graham Dumpleton <[EMAIL PROTECTED]> wrote:
On 13/04/07, Arturo 'Buanzo' Busleiman <[EMAIL PROTECTED]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Hi group!
>
> For mod_auth_openpgp I need to read the POST body. During my research
> (googling, archives of this list, apache.org, etc) I discovered three
> methods so far. I would like your opinions on the safest one, fastest
> one, if should DECHUNK, how much to allow for post size allocation
> (probably a configuration option, but i'd need a default value...).
>
> This is what I got: anything you can think of would be of GREAT help:
>
> Getting REQUEST BODY: (1)
> ============================
>
>         ap_setup_client_block(r, REQUEST_CHUNKED_DECHUNK);
>
>         char buffer[1024];
>
>         if ( ap_should_client_block(r) == 1 ) {
>                 while ( ap_get_client_block(r, buffer, 1024) > 0 ) {
>                         ap_rputs("Reading in buffer...<br>",r);
>                         ap_rputs(buffer,r);
>                 }
>         } else {
>                 ap_rputs("Nothing to read...<br>",r);
>         }

I can't find a reference to point at, so my memory could be wrong, but
if you use this approach, one thing to be mindful of is that the read
buffer must be large enough to hold any chunk size information and any
trailers provided after the last null chunk when chunked transfer
encoding is being used. This is because the HTTP filter code uses your
buffer as working space for decoding those parts of the request stream.

Found the documentation about this. I was just looking at the wrong
place in the source code comments. What it says is:

* 1. Call ap_setup_client_block() near the beginning of the request
*    handler. This will set up all the necessary properties, and will
*    return either OK, or an error code. If the latter, the module should
*    return that error code. The second parameter selects the policy to
*    apply if the request message indicates a body, and how a chunked
*    transfer-coding should be interpreted. Choose one of
*
*    REQUEST_NO_BODY          Send 413 error if message has any body
*    REQUEST_CHUNKED_ERROR    Send 411 error if body without Content-Length
*    REQUEST_CHUNKED_DECHUNK  If chunked, remove the chunks for me.
*    REQUEST_CHUNKED_PASS     If chunked, pass the chunk headers with body.
*
*    In order to use the last two options, the caller MUST provide a buffer
*    large enough to hold a chunk-size line, including any extensions.

Note the last line where it says:

"""caller MUST provide a buffer large enough to hold a chunk-size
line, including any extensions."""

I think I was slightly wrong on one point though: I said 'trailers',
which are something different from 'extensions'. RFC 2616 (section
3.6.1) says:

      Chunked-Body   = *chunk
                       last-chunk
                       trailer
                       CRLF

      chunk          = chunk-size [ chunk-extension ] CRLF
                       chunk-data CRLF
      chunk-size     = 1*HEX
      last-chunk     = 1*("0") [ chunk-extension ] CRLF

      chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
      chunk-ext-name = token
      chunk-ext-val  = token | quoted-string
      chunk-data     = chunk-size(OCTET)
      trailer        = *(entity-header CRLF)

Even so, how long is an extension going to be, and therefore what is
the minimum buffer size one should use? This is something I have never
found a good answer to, as examples of 'extensions' are hard to find.
The only thing I have found so far that defines an 'extension' is the
ICAP protocol (RFC 3507). It defines:

     \r\n
     0; ieof\r\n\r\n

Thus it is short.
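
To make the buffer requirement concrete, here is a rough sketch of what
"large enough to hold a chunk-size line, including any extensions"
means for a parser working inside a caller-supplied buffer. This is not
Apache's code; parse_chunk_size_line() is a hypothetical helper written
only for illustration, and it assumes the whole line (up to and
including the CRLF) has to be present in the buffer before it can be
decoded.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of Apache. Parses one chunk-size line,
 * e.g. "1a;name=value\r\n", that must be fully present in buf[0..len).
 * On success returns 0, stores the chunk size and the number of bytes
 * the line occupied (including CRLF). Returns -1 if the buffer does not
 * hold a complete line or the line is malformed. */
int parse_chunk_size_line(const char *buf, size_t len,
                          unsigned long *chunk_size, size_t *line_len)
{
    size_t i;
    size_t eol = len;          /* len means "CRLF not found" */
    size_t digits = 0;
    unsigned long size = 0;

    assert(buf != NULL && chunk_size != NULL && line_len != NULL);

    /* The line only counts as complete once its CRLF is in the buffer. */
    for (i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            eol = i;
            break;
        }
    }
    if (eol == len)
        return -1;             /* buffer too small for this line */

    /* chunk-size = 1*HEX */
    for (i = 0; i < eol; i++) {
        char c = buf[i];
        unsigned v;
        if (c >= '0' && c <= '9')      v = (unsigned)(c - '0');
        else if (c >= 'a' && c <= 'f') v = (unsigned)(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F') v = (unsigned)(c - 'A') + 10;
        else break;
        if (size > (ULONG_MAX >> 4))
            return -1;         /* size would overflow */
        size = size * 16 + v;
        digits++;
    }
    if (digits == 0)
        return -1;

    /* Tolerate spaces or tabs, then require ';' if anything follows. */
    while (i < eol && (buf[i] == ' ' || buf[i] == '\t'))
        i++;
    if (i < eol && buf[i] != ';')
        return -1;

    *chunk_size = size;
    *line_len = eol + 2;       /* line plus CRLF */
    return 0;
}
```

With the ICAP style line "0; ieof\r\n" the whole line is 9 bytes, so
even a tiny buffer copes. The failure case is a buffer that ends
partway through a longer extension: the sketch can only report an error
there, because it has nowhere else to keep the partial line.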

Overall I really don't know how big a deal this is. If you always
control the buffer size used, it probably isn't one. But if the size is
specified by higher up code, for example in some scripting language
wrapper, then you have to be mindful of it: always read with a bigger
buffer, return only as much as the higher up code asked for, and cache
any extra for the next read.
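
A minimal sketch of that wrapper idea, under stated assumptions:
body_reader, read_fn, MIN_READ_SIZE and mem_source are all names made
up for this example, and the 8192 figure is an arbitrary guess rather
than a documented minimum. The read_fn callback stands in for whatever
does the real read (in a module that would be a call such as
ap_get_client_block()); mem_source is just an in-memory source so the
sketch can be exercised on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed value: big enough for any chunk-size line you expect. */
#define MIN_READ_SIZE 8192

/* Stand-in for the real reader. Returns bytes read, 0 at end of
 * input, or a negative value on error. */
typedef long (*read_fn)(void *ctx, char *buf, size_t len);

typedef struct {
    read_fn read;
    void *ctx;
    char cache[MIN_READ_SIZE];
    size_t start;   /* offset of first unread cached byte */
    size_t end;     /* one past the last cached byte */
} body_reader;

void body_reader_init(body_reader *br, read_fn read, void *ctx)
{
    assert(br != NULL && read != NULL);
    br->read = read;
    br->ctx = ctx;
    br->start = 0;
    br->end = 0;
}

/* Hand back at most 'want' bytes. The underlying read is always made
 * with the full-size cache, never with the caller's (possibly tiny)
 * buffer; leftover bytes stay cached for the next call. */
long body_reader_read(body_reader *br, char *out, size_t want)
{
    size_t avail, n;

    assert(br != NULL && out != NULL);
    if (want == 0)
        return 0;

    if (br->start == br->end) {
        long got = br->read(br->ctx, br->cache, sizeof br->cache);
        if (got <= 0)
            return got;        /* end of input or error */
        br->start = 0;
        br->end = (size_t)got;
    }

    avail = br->end - br->start;
    n = want < avail ? want : avail;
    memcpy(out, br->cache + br->start, n);
    br->start += n;
    return (long)n;
}

/* Demonstration source only: serves bytes from a block of memory. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
} mem_source;

long mem_source_read(void *ctx, char *buf, size_t len)
{
    mem_source *src = (mem_source *)ctx;
    size_t left = src->len - src->pos;
    size_t n = len < left ? len : left;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return (long)n;
}
```

The caller can then ask for 4 bytes at a time without the underlying
read ever seeing a 4 byte buffer. The cost is one extra copy per read
and a fixed block of memory per request, which seems a fair trade if
the buffer size really is out of your hands.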

Am I just being overly cautious? Anyone else have any comments to make
about this issue with chunked data?

Graham
