New submission from Ma Lin <[email protected]>:
BufferedReader's constructor has a `buffer_size` parameter, which is the
size of this buffer:
When reading data from BufferedReader object, a larger
amount of data may be requested from the underlying raw
stream, and kept in an internal buffer.
(quoted from the BufferedReader documentation [1])
When BufferedReader.read(size) is called:
1. When `size` is a positive number, it reads `buffer_size`
bytes from the underlying stream. This is the expected behavior.
2. When `size` is -1, it tries to call the underlying stream's
readall() function [2]. In this case `buffer_size` is not
respected.
The underlying stream may be a `RawIOBase`, whose readall()
function reads `DEFAULT_BUFFER_SIZE` bytes in each read [3].
`DEFAULT_BUFFER_SIZE` is currently only 8 KB, which is very
inefficient for BufferedReader.read(-1). Reading `buffer_size`
bytes each time would give the expected performance.
Attached file demonstrates this problem.
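For readers without the attachment, here is a minimal sketch of how the
difference can be observed. It is not the attached demo.py. The class
`CountingRaw` and the helper `measure` are hypothetical names made up for
this illustration, and the sizes are arbitrary. The raw stream records how
many bytes each readinto() call asks for, so the request sizes for
read(-1) and for a positive `size` can be compared:

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that records the size of every readinto() request."""

    def __init__(self, total_size):
        super().__init__()
        self._remaining = total_size
        self.request_sizes = []

    def readable(self):
        return True

    def readinto(self, b):
        # len(b) is how many bytes the caller asked for in this request.
        self.request_sizes.append(len(b))
        n = min(len(b), self._remaining)
        if n:
            b[:n] = bytes(n)  # fill with zero bytes
            self._remaining -= n
        return n  # 0 signals EOF


def measure(total_size, buffer_size, read_size):
    """Return (bytes read, list of raw request sizes) for one read() call."""
    raw = CountingRaw(total_size)
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    data = reader.read(read_size)
    return len(data), raw.request_sizes


TOTAL = 1024 * 1024   # arbitrary stream length
BUF = 256 * 1024      # arbitrary buffer_size, larger than DEFAULT_BUFFER_SIZE

for size in (100, -1):
    n, sizes = measure(TOTAL, BUF, size)
    print(f"read({size}): {n} bytes, {len(sizes)} raw requests, "
          f"largest request {max(sizes)} bytes "
          f"(DEFAULT_BUFFER_SIZE={io.DEFAULT_BUFFER_SIZE})")
```

If the behavior described above holds on the interpreter being tested, the
read(-1) line should show many small raw requests rather than a few
requests of `buffer_size` bytes.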
[1] doc of BufferedReader:
https://docs.python.org/3/library/io.html#io.BufferedReader
[2] BufferedReader.read(-1) tries to call underlying stream's readall()
function:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/bufferedio.c#L1538-L1542
[3] RawIOBase.readall() reads DEFAULT_BUFFER_SIZE bytes each time:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/iobase.c#L968-L969
----------
components: IO
files: demo.py
messages: 374652
nosy: malin
priority: normal
severity: normal
status: open
title: Inefficient BufferedReader.read(-1)
type: performance
versions: Python 3.10
Added file: https://bugs.python.org/file49354/demo.py
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41452>
_______________________________________