New submission from STINNER Victor:
The _PyUnicodeWriter API avoids creating temporary Unicode strings and
performs very well when building Unicode strings with the PEP 393
representation (compact Unicode strings).
The attached patch adds a _PyObject_ReprWriter() function to avoid creating
temporary Unicode strings when calling repr(obj) on containers such as tuple,
list, or dict.
I did something similar for str%args and str.format(args).
To avoid the following code, we might add something to PyTypeObject, maybe a
new tp_repr_writer field.
+    if (PyLong_CheckExact(v)) {
+        return _PyLong_FormatWriter(writer, v, 10, 0);
+    }
+    if (PyUnicode_CheckExact(v)) {
+        return _PyUnicode_ReprWriter(writer, v);
+    }
+    if (PyList_CheckExact(v)) {
+        return _PyList_ReprWriter(writer, v);
+    }
+    if (PyTuple_CheckExact(v)) {
+        return _PyTuple_ReprWriter(writer, v);
+    }
+    if (PyDict_CheckExact(v)) {
+        return _PyDict_ReprWriter(writer, v);
+    }
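To make the tp_repr_writer idea more concrete, here is a rough standalone
sketch. It is not code from the patch and it does not use the CPython API:
writer_t, type_t, object_t, the repr_writer slot and object_repr_writer() are
all hypothetical stand-ins. It only shows how a per-type function pointer
could replace the chain of exact-type checks, with a generic fallback for
types that do not fill the slot.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for _PyUnicodeWriter (fixed-size for brevity). */
typedef struct {
    char buf[128];
    size_t len;
} writer_t;

struct object;
typedef int (*repr_writer_fn)(writer_t *, const struct object *);

/* Hypothetical stand-in for PyTypeObject with a tp_repr_writer-like slot. */
typedef struct {
    const char *name;
    repr_writer_fn repr_writer;   /* may be NULL */
} type_t;

typedef struct object {
    const type_t *type;
    long value;
} object_t;

/* Append n bytes to the writer; returns 0 on success, -1 if it is full. */
static int writer_write(writer_t *w, const char *s, size_t n)
{
    if (w->len + n > sizeof w->buf)
        return -1;
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    return 0;
}

/* Slot implementation for an integer-like type. */
static int long_repr_writer(writer_t *w, const object_t *o)
{
    char tmp[32];
    int n = snprintf(tmp, sizeof tmp, "%ld", o->value);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    return writer_write(w, tmp, (size_t)n);
}

/* One type fills the slot, the other leaves it empty. */
static const type_t long_type = {"long", long_repr_writer};
static const type_t opaque_type = {"opaque", NULL};

/* Generic entry point: one slot lookup instead of a chain of type checks. */
static int object_repr_writer(writer_t *w, const object_t *o)
{
    char tmp[64];
    int n;

    if (o->type->repr_writer != NULL)
        return o->type->repr_writer(w, o);

    /* Fallback for types without the slot: build a generic repr. */
    n = snprintf(tmp, sizeof tmp, "<%s object>", o->type->name);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    return writer_write(w, tmp, (size_t)n);
}
```

Calling object_repr_writer() on an object of long_type and then on one of
opaque_type appends both reprs to the same writer. The design question for
the real PyTypeObject would be the cost of a new slot against the cost of
the explicit checks above.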
For example, repr(list(range(10))) ('[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]') should
allocate only one buffer of 37 bytes and then shrink it to 30 bytes.
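The allocation pattern described above can be sketched in plain C. This is
only an illustration, not the patch and not the real _PyUnicodeWriter:
writer_t, writer_reserve(), writer_write(), writer_write_long() and
repr_long_list() are hypothetical names. The sketch assumes a 25%
overallocation, chosen because 30 + 30/4 = 37 matches the sizes quoted above,
and it assumes a minimum-size estimate of one character per item.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for _PyUnicodeWriter: one growable byte buffer. */
typedef struct {
    char *buf;
    size_t len;   /* bytes written so far */
    size_t cap;   /* bytes allocated */
} writer_t;

/* Ensure room for `extra` more bytes, overallocating by 25% when growing. */
static int writer_reserve(writer_t *w, size_t extra)
{
    size_t need = w->len + extra;
    char *p;

    if (need <= w->cap)
        return 0;
    need += need / 4;
    p = realloc(w->buf, need);
    if (p == NULL)
        return -1;
    w->buf = p;
    w->cap = need;
    return 0;
}

/* Append n bytes; returns 0 on success, -1 on allocation failure. */
static int writer_write(writer_t *w, const char *s, size_t n)
{
    assert(s != NULL);
    if (writer_reserve(w, n) < 0)
        return -1;
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    return 0;
}

/* Format a long through a small stack buffer, then append it. */
static int writer_write_long(writer_t *w, long v)
{
    char tmp[32];
    int n = snprintf(tmp, sizeof tmp, "%ld", v);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    return writer_write(w, tmp, (size_t)n);
}

/* Build "[a, b, ...]" in one buffer. The result is NOT NUL-terminated;
   its length is stored in *out_len. The caller frees the result. */
static char *repr_long_list(const long *items, size_t n,
                            size_t *out_len, size_t *out_peak_cap)
{
    writer_t w = {NULL, 0, 0};
    /* Minimum size: brackets, one char per item, ", " between items. */
    size_t estimate = 2 + n + (n ? 2 * (n - 1) : 0);
    size_t i;
    char *shrunk;

    if (writer_reserve(&w, estimate) < 0)
        goto error;
    if (writer_write(&w, "[", 1) < 0)
        goto error;
    for (i = 0; i < n; i++) {
        if (i > 0 && writer_write(&w, ", ", 2) < 0)
            goto error;
        if (writer_write_long(&w, items[i]) < 0)
            goto error;
    }
    if (writer_write(&w, "]", 1) < 0)
        goto error;

    *out_peak_cap = w.cap;
    /* Give back the unused tail of the buffer. */
    shrunk = realloc(w.buf, w.len);
    if (shrunk != NULL)
        w.buf = shrunk;
    *out_len = w.len;
    return w.buf;

error:
    free(w.buf);
    return NULL;
}
```

For the ten items 0 through 9, the estimate is 30 bytes, the single
allocation is 37 bytes, and the final realloc() trims the buffer to the 30
bytes actually written. Items with more than one digit would exceed the
estimate and trigger further growth, which is where benchmarks would matter.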
I guess that benchmarks are required to justify such changes.
----------
files: repr_writer.patch
keywords: patch
messages: 203371
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Generalize usage of _PyUnicodeWriter for repr(obj): add
_PyObject_ReprWriter()
versions: Python 3.4
Added file: http://bugs.python.org/file32701/repr_writer.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue19653>
_______________________________________