#31564: Django fails to return HttpResponse message on early response with large
uploads
-------------------------------------+-------------------------------------
Reporter: Jacob Crabtree             | Owner: nobody
Type: Uncategorized                  | Status: new
Component: File uploads/storage      | Version: 3.0
Severity: Normal                     | Keywords: Memory Error, Large File Upload, HTTP 414
Triage Stage: Unreviewed             | Has patch: 0
Needs documentation: 0               | Needs tests: 0
Patch needs improvement: 0           | Easy pickings: 0
UI/UX: 0                             |
-------------------------------------+-------------------------------------
== Short Description:
I am trying to create a Django API that allows people to upload large
files (think >1GB) to the server. This API will sometimes need to stop the
upload early and return a response to the client. I have discovered that
Django fails to return the response message if the upload is sufficiently
large. Instead, the client receives a generic HTML template stating the
error code. On the Django server, it attempts to read *all* of the data
left in the pipeline and interpret it as a URI (at least, as far as I can
tell), and then throws a 414 Request-URI Too Long error.
This error does not occur with smaller files. I haven't found a lower
limit, but 1GB triggers it fairly reliably. If your computer has a lot of
RAM, or the error otherwise does not occur, try a larger file.
Versions: Django v3.0.4, PycURL v7.43.0.2, and Python 3.6.4
== Example Django Server to reproduce
{{{
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST


@csrf_exempt
@require_POST
def upload(request):
    print('got request')
    # Read a single 50 MiB chunk of the request body, then stop
    read_size = 1024 * 1024 * 50
    data = request.read(read_size)
    # HttpResponse takes `status` (not `status_code`) for the HTTP status
    return HttpResponse("Early Response!", status=500)
}}}
This server has one endpoint at /upload, and just reads one chunk of data
and then returns an early response. Also, in case it matters, in the
Django settings file I have added the following line to allow for these
large uploads:
{{{DATA_UPLOAD_MAX_MEMORY_SIZE = None}}}
== Example upload code
{{{
import os
import pycurl
from io import BytesIO


def config_pycurl(url, file_handle, file_size, response_buffer):
    # Create the curl object, set the URL and HTTP method
    c = pycurl.Curl()
    c.setopt(pycurl.VERBOSE, True)
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.POST, 1)
    # Give curl a buffer to write the remote server's response to
    c.setopt(pycurl.WRITEDATA, response_buffer)
    c.setopt(pycurl.SSL_VERIFYPEER, 0)
    c.setopt(pycurl.SSL_VERIFYHOST, 0)
    c.setopt(pycurl.POSTFIELDSIZE, file_size)
    c.setopt(pycurl.READFUNCTION, file_handle.read)
    return c


def send_pycurl(url, file_path, file_size):
    resp_buffer = BytesIO()
    input_file = open(file_path, 'rb')
    c = config_pycurl(url, input_file, file_size, resp_buffer)
    c.perform()
    # Get HTTP response code, clean up handles
    resp_code = c.getinfo(c.RESPONSE_CODE)
    c.close()
    input_file.close()
    # Get response from the server - iso-8859-1 is the default encoding
    # curl performs
    resp_body = resp_buffer.getvalue().decode('iso-8859-1')
    print(f'code: {resp_code}')
    print(f'response: {resp_body}')


if __name__ == '__main__':
    post_url = 'http://localhost:8000/upload'
    name = 'test_files/1_gigabyte_file.txt'
    send_pycurl(post_url, name, os.path.getsize(name))
}}}
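For anyone reproducing this without the attached files, the test file can be generated locally. The following is a minimal sketch, not part of the original report: the helper name `make_test_file` and its parameters are hypothetical, and it simply writes zero bytes in fixed-size chunks so the generator itself does not need much memory.

```python
import os


def make_test_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write `size_bytes` zero bytes to `path` in chunks of `chunk_size`.

    Hypothetical helper for reproduction only; returns the final file size.
    """
    parent = os.path.dirname(path)
    if parent:
        # Create the containing directory (e.g. test_files/) if needed
        os.makedirs(parent, exist_ok=True)
    chunk = b"\0" * chunk_size
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n
    return os.path.getsize(path)
```

Calling `make_test_file('test_files/1_gigabyte_file.txt', 1024 ** 3 + 1)` should produce a file of 1073741825 bytes, which matches the Content-Length shown in the PycURL output in this ticket.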
== Result
After returning HttpResponse, Django prints the following stack trace:
{{{
Traceback (most recent call last):
  File "c:\python\3.6.4\Lib\wsgiref\handlers.py", line 138, in run
    self.finish_response()
  File "c:\python\3.6.4\Lib\wsgiref\handlers.py", line 183, in finish_response
    self.close()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 113, in close
    self.get_stdin()._read_limited()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\handlers\wsgi.py", line 28, in _read_limited
    result = self.stream.read(size)
MemoryError
[10/May/2020 19:32:23] code 414, message Request-URI Too Long
[10/May/2020 19:32:23] "" 414 -
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 63835)
Traceback (most recent call last):
  File "c:\python\3.6.4\Lib\socketserver.py", line 639, in process_request_thread
    self.finish_request(request, client_address)
  File "c:\python\3.6.4\Lib\socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "c:\python\3.6.4\Lib\socketserver.py", line 696, in __init__
    self.handle()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 174, in handle
    self.handle_one_request()
  File "c:\.virtualenv\djangoError\lib\site-packages\django\core\servers\basehttp.py", line 187, in handle_one_request
    self.send_error(414)
  File "c:\python\3.6.4\Lib\http\server.py", line 473, in send_error
    self.wfile.write(body)
  File "c:\python\3.6.4\Lib\socketserver.py", line 775, in write
    self._sock.sendall(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
}}}
On the PycURL side, I see the following from its verbose output:
{{{
* Trying ::1...
* TCP_NODELAY set
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /upload HTTP/1.1
Host: localhost:8000
User-Agent: PycURL/7.43.0.2 libcurl/7.60.0 OpenSSL/1.1.0h zlib/1.2.11
c-ares/1.14.0 WinIDN libssh2/1.8.0 nghttp2/1.32.0
Accept: */*
Content-Length: 1073741825
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue
< HTTP/1.1 100 Continue
< HTTP/1.1 500 Internal Server Error
< Date: Sun, 10 May 2020 23:32:23 GMT
< Server: WSGIServer/0.2 CPython/3.6.4
code: 500
< Content-Type: text/html
< X-Frame-Options: DENY
response:
< Content-Length: 145
<!doctype html>
< Vary: Cookie
<html lang="en">
< X-Content-Type-Options: nosniff
<head>
* HTTP error before end of send, stop sending
<title>Server Error (500)</title>
<
</head>
<body>
<h1>Server Error (500)</h1><p></p>
</body>
</html>
* Closing connection 0
}}}
== Expected Result
I would expect Django to respond with a 500 Server Error. Additionally, it
should have no memory errors. It appears from the stack trace that Django
is attempting to read as much as Content-Length originally advertised, and
then throws an error when the OS refuses to allocate that much memory. The
414 error is also not expected.
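To illustrate the kind of behaviour that would avoid the MemoryError, here is a rough sketch of discarding the unread part of a request body in bounded chunks instead of with a single read of the remaining length. This is an assumption about one possible approach, not Django's actual code: `drain_in_chunks`, its parameters, and the 64 KiB chunk size are all hypothetical, and an in-memory `BytesIO` stands in for the request stream.

```python
import io


def drain_in_chunks(stream, remaining, chunk_size=64 * 1024):
    """Discard up to `remaining` unread bytes, at most `chunk_size` per read.

    Hypothetical helper: memory use is bounded by `chunk_size`, regardless
    of how large the advertised Content-Length was. Returns the number of
    bytes actually discarded.
    """
    discarded = 0
    while remaining > 0:
        data = stream.read(min(chunk_size, remaining))
        if not data:
            # The client stopped sending before Content-Length was reached
            break
        discarded += len(data)
        remaining -= len(data)
    return discarded


# Stand-in for a request body of which the view has read only the first part
body = io.BytesIO(b"x" * 200_000)
body.read(50_000)
print(drain_in_chunks(body, 150_000))
```

With a loop like this, a client that stops sending early (as curl does after seeing the 500) just ends the loop, and no single allocation ever approaches the size of the upload.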
On PycURL's side, the response code variable should be 500, and the
response body should be "Early Response!", the string passed to Django's
HttpResponse.
Please let me know if any additional information is required; this is my
first time submitting a bug here! I have also attached the above scripts
as files so it's hopefully easier to set up and reproduce.
--
Ticket URL: <https://code.djangoproject.com/ticket/31564>