Re: [Django] #35838: request.read() returns empty for Requests w/ Transfer-Encoding: Chunked

Django Mon, 14 Oct 2024 09:11:05 -0700

#35838: request.read() returns empty for Requests w/ Transfer-Encoding: Chunked
------------------------------------+--------------------------------------
     Reporter:  Klaas van Schelven  |                    Owner:  (none)
         Type:  Uncategorized       |                   Status:  new
    Component:  HTTP handling       |                  Version:  5.0
     Severity:  Normal              |               Resolution:
     Keywords:                      |             Triage Stage:  Unreviewed
    Has patch:  0                   |      Needs documentation:  0
  Needs tests:  0                   |  Patch needs improvement:  0
Easy pickings:  0                   |                    UI/UX:  0
------------------------------------+--------------------------------------
Description changed by Klaas van Schelven:


Old description:

> Django's request.read() returns 0 bytes when there's no Content-Length
> header.
> i.e. it silently fails.
>
> But not having a Content-Length header is perfectly fine when there's an
> HTTP/1.1 Transfer-Encoding: Chunked request.
>
> WSGI servers like gunicorn and mod_wsgi are able to handle this just
> fine, i.e. Gunicorn handles the hexadecimally encoded lengths and just
> passes you the right chunks and Apache's mod_wsgi does the same I
> believe.
>
> Discussions/docs over at Gunicorn / mod_wsgi:
>
> * https://github.com/benoitc/gunicorn/issues/1264
> * https://github.com/benoitc/gunicorn/issues/605
> * https://github.com/benoitc/gunicorn/issues/2947
> * https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIChunkedRequest.html
>
> The actual single line of code that's problematic is this one:
> https://github.com/django/django/blob/97c05a64ca87253e9789ebaab4b6d20a1b2370cf/django/core/handlers/wsgi.py#L77
>
> My personal reason I ran into this:
>
> * https://github.com/bugsink/bugsink/issues/9
>
> which is basically solved by using a lot of lines to undo the effects of
> that single line of code:
>
> {{{
> import django
> from django.core.handlers.wsgi import WSGIHandler, WSGIRequest
>
> os.environ.setdefault('DJANGO_SETTINGS_MODULE', '....
>
>
> class MyWSGIRequest(WSGIRequest):
>
>     def __init__(self, environ):
>         super().__init__(environ)
>
>         if "CONTENT_LENGTH" not in environ and "HTTP_TRANSFER_ENCODING" in environ:
>             # "unlimit" content length
>             self._stream = self.environ["wsgi.input"]
>
>
> class MyWSGIHandler(WSGIHandler):
>     request_class = MyWSGIRequest
>
>
> def my_get_wsgi_application():
>     # Like get_wsgi_application, but returns a subclass of WSGIHandler
>     # that uses a custom request class.
>     django.setup(set_prefix=False)
>     return MyWSGIHandler()
>
>
> application = my_get_wsgi_application()
>
> }}}
>
> But I'd rather not solve this in my own application only, but have it be
> a Django thing. In the Gunicorn links, there's some allusion to "this
> won't happen b/c wsgi spec", but that seems like a bad reason from my
> perspective. At the very least request.read() should not just silently
> give a 0-length answer. And having tools available such that you don't
> need to make a "MyXXX" hierarchy would also be nice.
>
> That's the bit for "actually getting request.read() to work when behind
> gunicorn".
>
> There's also the part where this doesn't work for the debugserver. Which
> is fine, given its limited scope. But an error would be better than
> silently returning nothing (again).
>
> My solution for that one is the following middleware:
>

> {{{
> class DisallowChunkedMiddleware:
>     def __init__(self, get_response):
>         self.get_response = get_response
>
>     def __call__(self, request):
>         if request.META.get("HTTP_TRANSFER_ENCODING").lower() == "chunked" and \
>                 not request.META.get("wsgi.input_terminated"):
>
>             # If we get here, it means that the request has a
>             # Transfer-Encoding header with a value of "chunked", but we
>             # have no wsgi-layer handling for that. This probably means
>             # that we're running the Django development server, and as
>             # such our fixes for the Gunicorn/Django mismatch that we put
>             # in wsgi.py are not in effect.
>             raise ValueError(
>                 "This server is not configured to support Chunked Transfer "
>                 "Encoding (for requests)")
>
>         return self.get_response(request)
> }}}
>

> Some links:
>
> * This one seemed related, but probably isn't (it's about forms):
> https://code.djangoproject.com/ticket/35289
> * This seemed related, but is about uwsgi and uses special uwsgi
> features: https://github.com/btimby/uwsgi-chunked/blob/master/uwsgi_chunked/chunked.py
>
> All of the above is when using WSGI (I did not test/think about ASGI)

New description:

 Django's request.read() returns 0 bytes when there's no Content-Length
 header, i.e. it fails silently.

 But a missing Content-Length header is perfectly fine for an HTTP/1.1
 request that uses Transfer-Encoding: chunked.

 WSGI servers like Gunicorn and mod_wsgi handle this just fine: Gunicorn
 decodes the hexadecimally encoded chunk lengths and passes you the right
 chunks, and I believe Apache's mod_wsgi does the same.
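
To make "hexadecimally encoded lengths" concrete, here is a minimal sketch of chunked decoding. It is not Gunicorn's or mod_wsgi's code; the function name decode_chunked is made up for illustration, and trailers are ignored.

```python
import io


def decode_chunked(stream):
    """Decode an HTTP/1.1 chunked body from a binary stream (illustrative only)."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise ValueError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # last chunk; trailers (if any) are not parsed in this sketch
        chunk = stream.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        body += chunk
        stream.read(2)  # consume the CRLF that follows each chunk's data
    return bytes(body)


# Two chunks of 5 and 7 bytes, followed by the zero-length terminating chunk.
raw = b"5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n"
decoded = decode_chunked(io.BytesIO(raw))
```

A server that does this decoding hands the application the plain body bytes, which is why no Content-Length is needed to know where the body ends.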

 Discussions/docs over at Gunicorn / mod_wsgi:

 * https://github.com/benoitc/gunicorn/issues/1264
 * https://github.com/benoitc/gunicorn/issues/605
 * https://github.com/benoitc/gunicorn/issues/2947
 * https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIChunkedRequest.html

 The actual single line of code that's problematic is this one:
 
https://github.com/django/django/blob/97c05a64ca87253e9789ebaab4b6d20a1b2370cf/django/core/handlers/wsgi.py#L77
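
The effect can be reproduced without Django. The sketch below assumes that the linked line wraps wsgi.input in a reader limited to CONTENT_LENGTH bytes and that a missing header is treated as a length of 0. LimitedStreamSketch and build_stream are simplified stand-ins written for this example, not Django's own classes.

```python
import io


class LimitedStreamSketch:
    """Toy length-limited reader; a stand-in, not Django's LimitedStream."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size=-1):
        # Never read past the configured limit.
        if size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


def build_stream(environ):
    try:
        content_length = int(environ.get("CONTENT_LENGTH"))
    except (ValueError, TypeError):
        # Assumption: a missing or invalid header is treated as "no body".
        content_length = 0
    return LimitedStreamSketch(environ["wsgi.input"], content_length)


# A chunked request as the application sees it once the server has decoded it:
# the body is present, but there is no CONTENT_LENGTH key.
environ = {
    "HTTP_TRANSFER_ENCODING": "chunked",
    "wsgi.input": io.BytesIO(b"hello, world"),
}
body = build_stream(environ).read()
```

Under these assumptions body is empty even though wsgi.input holds 12 bytes, and nothing signals that data was left unread.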

 My personal reason I ran into this:

 * https://github.com/bugsink/bugsink/issues/9

 which is basically solved by using a lot of lines to undo the effects of
 that single line of code:

 {{{
 import os

 import django
 from django.core.handlers.wsgi import WSGIHandler, WSGIRequest

 os.environ.setdefault('DJANGO_SETTINGS_MODULE', '....


 class MyWSGIRequest(WSGIRequest):

     def __init__(self, environ):
         super().__init__(environ)

         if "CONTENT_LENGTH" not in environ and "HTTP_TRANSFER_ENCODING" in environ:
             # "unlimit" content length
             self._stream = self.environ["wsgi.input"]


 class MyWSGIHandler(WSGIHandler):
     request_class = MyWSGIRequest


 def my_get_wsgi_application():
     # Like get_wsgi_application, but returns a subclass of WSGIHandler
     # that uses a custom request class.
     django.setup(set_prefix=False)
     return MyWSGIHandler()


 application = my_get_wsgi_application()

 }}}

 But I'd rather not solve this in my own application only, but have it be a
 Django thing. In the Gunicorn links, there's some allusion to "this won't
 happen b/c wsgi spec", but that seems like a bad reason from my
 perspective. At the very least request.read() should not just silently
 give a 0-length answer. And having tools available such that you don't
 need to make a "MyXXX" hierarchy would also be nice.
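
One possible shape for such a tool, purely as a sketch: decide how much of wsgi.input may be read based on the wsgi.input_terminated flag that the server sets. The helper name select_body_stream is hypothetical and nothing like it exists in Django today. For brevity the limited case reads eagerly into memory, which a real implementation would not want to do.

```python
import io


def select_body_stream(environ):
    """Hypothetical helper: pick a body stream for a WSGI request."""
    wsgi_input = environ["wsgi.input"]
    if "CONTENT_LENGTH" not in environ and environ.get("wsgi.input_terminated"):
        # The server promises that wsgi.input signals EOF by itself,
        # so it is safe to read it without a length limit.
        return wsgi_input
    try:
        limit = int(environ.get("CONTENT_LENGTH"))
    except (ValueError, TypeError):
        limit = 0
    # Sketch only: eager read of at most `limit` bytes.
    return io.BytesIO(wsgi_input.read(limit))


terminated = {
    "wsgi.input": io.BytesIO(b"hello, world"),
    "wsgi.input_terminated": True,
}
full_body = select_body_stream(terminated).read()
```

With the flag set and no Content-Length, the whole body is readable. Without the flag, behaviour stays as it is now.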

 That's the bit for "actually getting request.read() to work when behind
 gunicorn".

 There's also the part where this doesn't work for the development server.
 That is fine, given its limited scope, but an error would be better than
 silently returning nothing (again).

 My solution for that one is the following middleware:


 {{{
 class DisallowChunkedMiddleware:
     def __init__(self, get_response):
         self.get_response = get_response

     def __call__(self, request):
         if "HTTP_TRANSFER_ENCODING" in request.META and \
                 request.META["HTTP_TRANSFER_ENCODING"].lower() == "chunked" and \
                 not request.META.get("wsgi.input_terminated"):

             # If we get here, it means that the request has a
             # Transfer-Encoding header with a value of "chunked", but we
             # have no wsgi-layer handling for that. This probably means
             # that we're running the Django development server, and as
             # such our fixes for the Gunicorn/Django mismatch that we put
             # in wsgi.py are not in effect.
             raise ValueError(
                 "This server is not configured to support Chunked Transfer "
                 "Encoding (for requests)")

         return self.get_response(request)
 }}}
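
The condition in that middleware can be checked in isolation. The sketch below pulls it out into a plain function over a dict that stands in for request.META; the function name is made up for this example.

```python
def chunked_without_server_support(meta):
    """Return True when a chunked request arrives but the server has not
    flagged wsgi.input as terminated (hypothetical helper for illustration)."""
    encoding = meta.get("HTTP_TRANSFER_ENCODING", "")
    return encoding.lower() == "chunked" and not meta.get("wsgi.input_terminated")


# Chunked request on a server that does not set the flag: should be rejected.
rejected = chunked_without_server_support({"HTTP_TRANSFER_ENCODING": "Chunked"})

# Same request on a server that sets the flag: should pass through.
allowed = chunked_without_server_support(
    {"HTTP_TRANSFER_ENCODING": "chunked", "wsgi.input_terminated": True}
)

# Ordinary request without the header: must pass through, not raise.
plain = chunked_without_server_support({"CONTENT_LENGTH": "12"})
```

The last case is the one to watch: a request with no Transfer-Encoding header at all has to be handled without calling .lower() on a missing value.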


 Some links:

 * This one seemed related, but probably isn't (it's about forms):
 https://code.djangoproject.com/ticket/35289
 * This seemed related, but is about uwsgi and uses special uwsgi features:
 https://github.com/btimby/uwsgi-chunked/blob/master/uwsgi_chunked/chunked.py

 All of the above applies to WSGI; I did not test or think about ASGI.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/35838#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
