Bruce Merry <bme...@gmail.com> added the comment:

> A patch would not land in Python 3.9 since this would be a new feature and 
> out-of-scope for a released version.

I see it as a fix for this bug. A fix has already landed, but it regresses 
another bug (bpo-36050), so this would be a better fix.

> Do you really want to store gigabytes of downloads in RAM instead of doing 
> chunked reads and store them on disk?

I work on HPC applications where large quantities of data are stored in an 
S3-compatible object store and fetched over HTTP at 25 Gb/s for processing. The 
data access layer tries very hard to avoid even making extra copies in memory 
(which is what caused me to file bpo-36050 in the first place), as that makes a 
significant difference at those speeds. Buffering to disk would be right out.

> then there are easier and better ways to deal with large buffers

Your example code is probably fine if one is working directly on an SSLSocket, 
but http.client wraps it in a buffered reader (via `socket.makefile`), and that 
reader implements `readinto` by reading into a temporary buffer and copying 
(https://github.com/python/cpython/blob/8d0647485db5af2a0f0929d6509479ca45f1281b/Modules/_io/bufferedio.c#L88),
 which would add overhead. A sketch of the resulting read path follows.
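
To make that concrete, here is a minimal sketch (host and path are 
hypothetical) of the read path in question: even a client that preallocates its 
buffer and uses `HTTPResponse.readinto` still goes through the buffered 
reader's temporary-and-copy step on every call.

```python
import http.client

conn = http.client.HTTPSConnection("example.com")  # hypothetical host
conn.request("GET", "/large-object")                # hypothetical path
resp = conn.getresponse()

# Preallocate the destination once, as a copy-avoiding client would.
buf = bytearray(resp.length or 0)
view = memoryview(buf)
pos = 0
while pos < len(buf):
    # Each call delegates to the io.BufferedReader from socket.makefile(),
    # which (per the bufferedio.c link above) reads into its internal
    # buffer and then copies into ours.
    n = resp.readinto(view[pos:])
    if n == 0:
        break
    pos += n
conn.close()
```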

I appreciate that what I'm proposing is a relatively complex change for a 
released version. A less intrusive option would be to change MAXAMOUNT in 
http.client from 1 MiB to 2 GiB - 1 byte (as suggested by @matan1008; sketched 
below). That would still leave 3.9 slower than 3.8 when reading >2 GiB 
responses over plain HTTP, but at least everything in the range [1 MiB, 2 GiB) 
would operate at full speed (which is the region I actually care about).
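
For reference, that option amounts to a one-line change. A sketch, not an 
official patch (the constant name follows the suggestion above; its exact 
location in Lib/http/client.py may differ between versions):

```python
# Raise the chunking threshold so anything under 2 GiB is read/written in
# a single call, restoring full speed for the [1 MiB, 2 GiB) range.
MAXAMOUNT = 2**31 - 1  # 2 GiB - 1 byte (was 1024 * 1024, i.e. 1 MiB)
```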

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42853>
_______________________________________