On 28 April 2012 01:34, [email protected] <[email protected]> wrote:
> I searched through mod_wsgi.c and Apache's modules/http/http_filters.c
> and to me it looks like the HTTP body sent to the daemon is *free* of
> the chunking information, which means none of the annoying chunk
> markers (56d\r\n, 574\r\n and \r\n0\r\n\r\n at the end).
> Inside the daemon, 'ap_http_filter' is added to the input chain again
> and is then called by ap_get_client_block when the application needs
> data. However, in this scenario 'ap_http_filter' inside the daemon
> process has *no* Content-Length header (obviously, because the request
> is chunked) and *no* chunking information inside the body itself, such
> as the trailing '0'.
> 'ap_http_filter' has no chance and returns APR_EOF to
> ap_get_client_block. ap_get_client_block interprets APR_EOF as an
> error and returns -1, which mod_wsgi.c converts to the exception
> above.

Not quite. The filter doesn't get confused by the lack of chunking in
the body: the Transfer-Encoding: chunked header doesn't exist, so it
wasn't expecting chunking in the first place. It quite happily reads
all the content; the problem is what happens after it has read all the
content. Specifically, ap_get_client_block() is only designed to work
with a content length or with chunked encoding. It cannot handle
unbounded normal content with no length. Thus, rather than treating
APR_EOF as end of input, it generates an error.

In practice the code shouldn't be using ap_get_client_block(). The
comments in it even say as much:

     /* We lose the failure code here.  This is why ap_get_client_block should
     * not be used.
     */

> Right now I see these solutions:

A quick hack is to do:

            if (n == -1) {
                if (wsgi_daemon_process && self->r->read_chunked &&
                    self->r->connection->keepalive) {

                    /* Have exhausted all the available input data. */

                    self->done = 1;
                }
                else {
                    PyErr_SetString(PyExc_IOError, "request data read error");
                    Py_DECREF(result);
                    return NULL;
                }
            }
            else if (n == 0) {
                /* Have exhausted all the available input data. */

                self->done = 1;
            }

            length += n;

The next option would be to stop using ap_get_client_block() and
instead duplicate what it does, so that APR_EOF can be detected
properly. That potentially also means not using
ap_setup_client_block() and ap_should_client_block(), at which point
you are starting to skip a lot of magic.
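
As a rough illustration of why keeping the status code matters, here is a small self-contained C sketch. It is not the mod_wsgi or Apache code: toy_source, toy_read(), lossy_read(), read_all() and the ST_* values are hypothetical stand-ins for the filter stack and the APR status codes. lossy_read() folds every non-success status into -1, in the way the comment quoted above complains about, while read_all() looks at the status itself and treats end of input as a normal finish.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for APR status codes; not the real APR values. */
typedef enum { ST_SUCCESS, ST_EOF, ST_ERROR } read_status;

/* Toy input source: a body with no declared length that simply ends. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;
    int fail;            /* non-zero simulates a genuine read failure */
} toy_source;

/* Status-preserving read: the caller can tell EOF apart from failure. */
static read_status toy_read(toy_source *src, char *buf, size_t bufsiz,
                            size_t *nread)
{
    size_t n;

    assert(src != NULL && buf != NULL && nread != NULL);
    *nread = 0;
    if (src->fail)
        return ST_ERROR;
    if (src->pos >= src->len)
        return ST_EOF;
    n = src->len - src->pos;
    if (n > bufsiz)
        n = bufsiz;
    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    *nread = n;
    return ST_SUCCESS;
}

/* Lossy wrapper: anything other than success becomes -1, so the
 * caller cannot distinguish end of input from a real error. */
static long lossy_read(toy_source *src, char *buf, size_t bufsiz)
{
    size_t n;

    if (toy_read(src, buf, bufsiz, &n) != ST_SUCCESS)
        return -1;
    return (long)n;
}

/* Read until EOF. Returns the number of bytes stored, or -1 only on a
 * real failure. Stops early (without error) if the output is full. */
static long read_all(toy_source *src, char *out, size_t outsiz)
{
    size_t total = 0;

    for (;;) {
        size_t n;
        read_status st;

        if (total == outsiz)
            return (long)total;
        st = toy_read(src, out + total, outsiz - total, &n);
        if (st == ST_EOF)
            return (long)total;      /* normal end of input */
        if (st == ST_ERROR)
            return -1;               /* genuine failure */
        total += n;
    }
}
```

With a source that has already been drained, lossy_read() returns -1 just as it does for a failing source, whereas read_all() returns the byte count for the former and -1 only for the latter.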

Whether HTTP_IN can be removed, I am not sure. I don't remember if
there was a specific reason it was there for the daemon. I was trying
to cheat by simulating the same filter stack in the daemon so I didn't
have to have two WSGI input implementations.

The last solution is to have daemon-specific wrappers directly around
the socket connection and avoid having all the Apache muck in there.

Graham

> 1) Don't solve this at the mod_wsgi level and write a separate
> Apache module which does the dechunking on its own. This module
> reads the *whole* request at once (similar to mod_request in
> Apache 2.4), then it can calculate the body size and replace the
> 'Transfer-Encoding' header with a Content-Length.
>
> 2) Do it like above but integrate the code into mod_wsgi itself. This
> code then runs before the request is sent to the daemon.
>
> 3) Don't run ap_http_filter at all inside the daemon. I mean,
> 'ap_http_filter' already runs in the process which accepts the
> request. So why execute it a second time in the daemon itself? To be
> more precise, this line from 2007 makes me wonder:
> 'ap_add_input_filter("HTTP_IN", NULL, r, r->connection);'
> I disabled this line just for fun and then it works, because the only
> filter called now is 'ap_core_input_filter', which returns APR_SUCCESS
> if no more data is coming.
> On the other hand, I can imagine that there is a use case where this
> line makes sense; I simply don't see it.
>
> If there is a good reason not to do option three, I have no problem
> with the other ones, because the requests our application expects will
> not blow up memory.
>
> Regards,
>  Stephan
>
>
> --
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/modwsgi?hl=en.
>
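
For reference, options 1 and 2 in the quoted message both come down to decoding the chunk framing (the 56d\r\n, 574\r\n and 0\r\n\r\n markers) and counting the bytes that remain, which gives the value for a Content-Length header. The following is only a minimal sketch of that decoding step in plain C, assuming the whole body is already in memory. dechunk() and hex_value() are hypothetical helpers, not Apache or mod_wsgi functions, and chunk extensions and trailers are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a complete chunked body held in memory.
 * Returns the decoded length (the Content-Length to advertise), or -1
 * if the input is malformed or truncated, or the output is too small. */
static long dechunk(const char *in, size_t inlen, char *out, size_t outsiz)
{
    size_t i = 0, total = 0;

    assert(in != NULL && out != NULL);
    for (;;) {
        size_t size = 0;
        int digits = 0;

        /* Chunk size line: hex digits followed by CRLF. */
        while (i < inlen && hex_value(in[i]) >= 0) {
            size = size * 16 + (size_t)hex_value(in[i]);
            i++;
            digits++;
        }
        if (digits == 0 || digits > 7)   /* also guards against overflow */
            return -1;
        if (inlen - i < 2 || in[i] != '\r' || in[i + 1] != '\n')
            return -1;
        i += 2;

        if (size == 0) {
            /* Last chunk: expect the closing CRLF (no trailers). */
            if (inlen - i < 2 || in[i] != '\r' || in[i + 1] != '\n')
                return -1;
            return (long)total;
        }

        /* Chunk data followed by CRLF. */
        if (inlen - i < size + 2 || outsiz - total < size)
            return -1;
        memcpy(out + total, in + i, size);
        total += size;
        i += size;
        if (in[i] != '\r' || in[i + 1] != '\n')
            return -1;
        i += 2;
    }
}
```

Because this buffers the entire body before the length is known, it only suits requests that are small enough to hold in memory, which is the caveat already raised in the quoted message.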
